N O N L I N
Nonlinear Regression Analysis Program
Phillip H. Sherrod
Member, Association of Shareware Professionals (ASP)
Nonlin allows you to perform statistical regression
analyses to estimate the values of parameters for
linear, multivariate, polynomial, logistic, exponential,
and general nonlinear functions. The regression
analysis determines the values of the parameters which
cause the function to best fit the observed data that
you provide. This process is also called "curve
fitting."
Nonlin allows you to specify the function whose
parameters are being estimated using ordinary algebraic
notation. In addition to determining the parameter
estimates, Nonlin can be directed to generate an output
file with predicted values and residuals. It can also
plot the data observations and the computed function.
Although designed for regression analysis, Nonlin can
also be used to find the root (zero point) or minimum
absolute value of a nonlinear expression. Nonlin is in
use at many engineering and research centers around the
world.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Introduction to Regression Analysis . . . . . . . . . . . . 1
1.2 Introduction to Nonlin . . . . . . . . . . . . . . . . . . 2
1.3 Installing Nonlin . . . . . . . . . . . . . . . . . . . . . 4
2. Using Nonlin . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Statement Syntax . . . . . . . . . . . . . . . . . . . . . 6
2.2 Variables and Parameters . . . . . . . . . . . . . . . . . 6
2.3 Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Overview of Computation Process . . . . . . . . . . . . . . 8
2.5 Function Specification . . . . . . . . . . . . . . . . . . 9
2.5.1 Arithmetic Operators . . . . . . . . . . . . . . . . . 9
2.5.2 Numeric Constants . . . . . . . . . . . . . . . . . . 10
2.5.3 Symbolic Constants . . . . . . . . . . . . . . . . . 11
2.5.4 Built-in Constant . . . . . . . . . . . . . . . . . . 11
2.5.5 Built-in Functions . . . . . . . . . . . . . . . . . 11
2.6 Nonlin Command Files . . . . . . . . . . . . . . . . . . 16
2.7 Comments . . . . . . . . . . . . . . . . . . . . . . . . 16
2.8 Include Files . . . . . . . . . . . . . . . . . . . . . . 16
2.9 Required Statements . . . . . . . . . . . . . . . . . . . 17
3. Nonlin Statements . . . . . . . . . . . . . . . . . . . . . 18
3.1 TITLE . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 VARIABLES . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3 PARAMETERS . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 DOUBLE . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5 CONSTANT . . . . . . . . . . . . . . . . . . . . . . . . 20
3.6 CONSTRAIN . . . . . . . . . . . . . . . . . . . . . . . . 20
3.7 SWEEP . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.8 FUNCTION . . . . . . . . . . . . . . . . . . . . . . . . 21
3.9 CORRELATE . . . . . . . . . . . . . . . . . . . . . . . . 22
3.10 COVARIANCE . . . . . . . . . . . . . . . . . . . . . . . 22
3.11 CONFIDENCE . . . . . . . . . . . . . . . . . . . . . . . 22
3.12 TOLERANCE . . . . . . . . . . . . . . . . . . . . . . . 23
3.13 ITERATIONS . . . . . . . . . . . . . . . . . . . . . . . 23
3.14 ANGLETYPE . . . . . . . . . . . . . . . . . . . . . . . 23
3.15 OUTPUT . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.16 POUTPUT . . . . . . . . . . . . . . . . . . . . . . . . 24
3.17 PLOT . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.18 SPLOT . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.19 RPLOT . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.20 NPLOT . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.21 PRESOLUTION . . . . . . . . . . . . . . . . . . . . . . 31
3.22 WIDTH . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.23 NOECHO . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.24 Assignment Statement . . . . . . . . . . . . . . . . . . 32
3.25 IF Statement . . . . . . . . . . . . . . . . . . . . . . 32
3.26 WHILE Statement . . . . . . . . . . . . . . . . . . . . 33
3.27 DO Statement . . . . . . . . . . . . . . . . . . . . . . 33
3.28 FOR Statement . . . . . . . . . . . . . . . . . . . . . 34
3.29 BREAK Statement . . . . . . . . . . . . . . . . . . . . 34
3.30 CONTINUE Statement . . . . . . . . . . . . . . . . . . . 35
3.31 STOP Statement . . . . . . . . . . . . . . . . . . . . . 35
3.32 DATA . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4. Understanding The Results . . . . . . . . . . . . . . . . . 37
4.1 Descriptive Statistics for Variables . . . . . . . . . . 37
4.2 Parameter Estimates . . . . . . . . . . . . . . . . . . . 37
4.3 t Statistic . . . . . . . . . . . . . . . . . . . . . . . 37
4.4 Prob(t) . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.5 Final Sum of Squared Deviations . . . . . . . . . . . . . 38
4.6 Average and Maximum Deviation . . . . . . . . . . . . . . 38
4.7 Proportion of Variance Explained . . . . . . . . . . . . 39
4.8 Adjusted Coefficient of Multiple Determination . . . . . 39
4.9 Durbin-Watson Statistic . . . . . . . . . . . . . . . . . 39
4.10 Analysis of Variance Table . . . . . . . . . . . . . . . 41
4.11 Correlation Matrix . . . . . . . . . . . . . . . . . . . 41
5. Theory of Operation . . . . . . . . . . . . . . . . . . . . 43
5.1 Minimization Algorithm . . . . . . . . . . . . . . . . . 43
5.2 Convergence Criterion . . . . . . . . . . . . . . . . . . 43
6. Hints for Nonlin Use . . . . . . . . . . . . . . . . . . . . 45
6.1 Convergence Failures . . . . . . . . . . . . . . . . . . 45
6.2 Singular Matrix Problems . . . . . . . . . . . . . . . . 46
6.3 Performance Issues . . . . . . . . . . . . . . . . . . . 46
6.4 Program Limits . . . . . . . . . . . . . . . . . . . . . 47
7. Example Analyses . . . . . . . . . . . . . . . . . . . . . . 48
8. Special Applications . . . . . . . . . . . . . . . . . . . . 52
8.1 Omitted Dependent Variable . . . . . . . . . . . . . . . 52
8.2 Root Finding and Expression Minimization . . . . . . . . 53
8.2.1 Function Minimization Examples . . . . . . . . . . . 55
9. Acknowledgement and Use of Nonlin . . . . . . . . . . . . . 56
9.1 Acknowledgement . . . . . . . . . . . . . . . . . . . . . 56
9.2 Use and Distribution of Nonlin . . . . . . . . . . . . . 56
9.3 Association of Shareware Professionals . . . . . . . . . 57
9.4 Copyright Notice . . . . . . . . . . . . . . . . . . . . 57
9.5 Disclaimer . . . . . . . . . . . . . . . . . . . . . . . 57
10. Other Software . . . . . . . . . . . . . . . . . . . . . . 59
10.1 Mathplot -- Mathematical Function Plotting Program . . . 59
10.2 TSX-32 -- Multi-User Operating System . . . . . . . . . 59
10.3 SIMSTAT -- Interactive Statistics Program . . . . . . . 60
11. Software Order Form . . . . . . . . . . . . . . . . . . . . 61
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Chapter 1
Introduction
1.1 Introduction to Regression Analysis
The goal of regression analysis is to determine the values of
parameters for a function that cause the function to best fit a
set of data observations that you provide. In linear regression,
the function is a linear (straight line) equation. For example,
if we assume the value of an automobile decreases by a constant
amount each year after its purchase, and for each mile driven, the
following linear function would predict its value (the dependent
variable on the left side of the equal sign) as a function of the
two independent variables which are age and miles:
value = price + depage*age + depmiles*miles
where 'value', the dependent variable, is the value of the car,
'age' is the age of the car, and 'miles' is the number of miles
that the car has been driven.
The regression analysis performed by Nonlin will determine the
best values of the three parameters, 'price', the estimated value
when age is 0 (i.e., when the car was new), 'depage', the
depreciation that takes place each year, and 'depmiles', the
depreciation for each mile driven. The values of 'depage' and
'depmiles' will be negative because the car loses value as age and
miles increase.
In a problem such as this car depreciation example, you must
provide a data file containing the values of the dependent and
independent variables for a set of observations. In this example
each observation record would contain three numbers: value, age,
and miles, collected from used car ads for the same model car.
The more observations you provide, the more accurate the estimates
of the parameters will be. The Nonlin statements to perform this
regression are shown below:
Variables value,age,miles;
Parameters price,depage,depmiles;
Function value = price + depage*age + depmiles*miles;
Data;
(data values go here)
Once the values of the parameters are determined by Nonlin, you
can use the formula to predict the value of a car based on its age
and miles driven. For example, if Nonlin computed a value of
16000 for price, -1000 for depage, and -0.15 for depmiles, then
the function
value = 16000 - 1000*age - 0.15*miles
could be used to estimate the value of a car with a known age and
number of miles.
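As a worked illustration of the arithmetic, the short Python sketch
below (Python is used here only for illustration; it is not Nonlin
syntax) evaluates the fitted function for one car. The car's age and
mileage are hypothetical; only the three parameter values come from
the example above.

```python
def predicted_value(age, miles):
    """value = 16000 - 1000*age - 0.15*miles, using the example estimates."""
    return 16000.0 - 1000.0 * age - 0.15 * miles

# A hypothetical car: 3 years old, driven 40,000 miles.
estimate = predicted_value(3.0, 40000.0)
print(f"estimated value: {estimate:.2f}")
```

Here the age term removes 3000 and the mileage term removes 6000, so
the estimate is 7000.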
If a perfect fit existed between the function and the actual data,
the actual value of each car in your data file would exactly equal
the predicted value. Typically, however, this is not the case,
and the difference between the actual value of the dependent
variable and its predicted value for a particular observation is
the error of the estimate which is known as the "deviation" or
"residual". The goal of regression analysis is to determine the
values of the parameters which minimize the sum of the squared
residual values for the set of observations. This is known as a
"least squares" regression fit.
1.2 Introduction to Nonlin
Nonlin is a very powerful regression analysis program. Using it
you can perform multivariate, linear, polynomial, exponential,
logistic, and general nonlinear regression. What this means is
that you specify the form of the function to be fitted to the
data, and the function may include nonlinear terms such as
variables raised to powers and library functions such as log,
exponential, sine, etc. For complex analyses Nonlin allows you to
specify function models using conditional statements (IF, ELSE),
looping (FOR, DO, WHILE), work variables, and arrays. Nonlin uses
a state-of-the-art regression algorithm that works as well as, or
better than, any you are likely to find in commercial statistical
packages.
As an example of nonlinear regression, consider another
depreciation problem. The value of a used airplane decreases for
each year of its age. Assuming the value of a plane falls by the
same amount each year, a linear function relating value to age is:
value = p0 + p1*Age
Where 'p0' and 'p1' are the parameters whose values are to be
determined. However, it is a well known fact that planes (and
automobiles) lose more value the first year than the second, and
more the second than the third, etc. This means that a linear
(straight line) function cannot accurately model this situation.
A better, nonlinear, function is:
value = p0 + p1*exp(-p2*Age)
Where the 'exp' function is the value of e (2.7182818...) raised
to a power. This type of function is known as "negative
exponential" and is appropriate for modeling a value whose rate of
decrease is proportional to the difference between the value and
some base value. The F33YEAR.NLR example command file fits a
linear function to the value of used airplanes. The F33EXP.NLR
example fits a negative exponential function to the same data.
Run both examples and compare the fitted functions. See F33.NLR
for an example of a multiple regression using three independent
variables.
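The behavior of the negative exponential function can be seen in a
small Python sketch (for illustration only; this is not Nonlin
syntax). The parameter values p0, p1, and p2 below are hypothetical
and were not taken from the F33 example files.

```python
import math

def plane_value(age, p0=20000.0, p1=30000.0, p2=0.25):
    """Negative exponential model: value = p0 + p1*exp(-p2*age)."""
    return p0 + p1 * math.exp(-p2 * age)

# Loss in value during each of the first five years.
yearly_loss = [plane_value(year) - plane_value(year + 1) for year in range(5)]

for year, loss in enumerate(yearly_loss, start=1):
    print(f"year {year}: loses {loss:.2f}")
```

Each year's loss is smaller than the previous year's by the constant
factor exp(-p2), and the value approaches the base value p0 as the
age grows. A straight line cannot reproduce this pattern.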
Much of the convenience of Nonlin comes from the fact that you can
enter complicated functions using ordinary algebraic notation.
Examples of functions that can be handled with Nonlin include:
Linear: Y = p0 + p1*X
Quadratic: Y = p0 + p1*X + p2*X^2
Multivariate: Y = p0 + p1*X + p2*Z + p3*X*Z
Exponential: Y = p0 + p1*exp(X)
Periodic: Y = p0 + p1*sin(p2*X)
Misc: Y = p0 + p1*X + p2*exp(X) + p3*sin(Z)
In other words, the function is a general expression involving one
dependent variable (on the left of the equal sign), one or more
independent variables, and one or more parameters whose values are
to be estimated.
Because of its generality, Nonlin can perform all of the
regressions handled by ordinary linear or multivariate regression
programs as well as nonlinear regression. However, in order to
handle nonlinear functions, Nonlin uses an iterative function
optimization algorithm which is slower than the simple linear
regression algorithm and has the potential for not converging to a
solution.
Note: Some other regression programs claim to perform nonlinear
regression but actually do it by transforming the values of the
variables such that the function is converted to linear form.
They then perform a linear regression on the transformed function.
This technique has a major flaw: it determines the values of the
parameters that minimize the squared residuals for the
transformed, linearized function rather than the original
function. This is different from minimizing the squared residuals
for the actual function, and the estimated values of the parameters
may not produce the best fit of the original function to the data.
Nonlin uses a true nonlinear regression technique that minimizes
the squared residuals for the actual function. Nonlin can also
handle functions that cannot be transformed to a linear form.
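The difference between the two approaches can be demonstrated with a
small Python sketch (for illustration only). It fits the
one-parameter function y = exp(b*x) in two ways: by linear regression
on ln(y), and by directly minimizing the squared residuals of the
original function. The data values are hypothetical, and the
golden-section search is a simple stand-in chosen for this sketch; it
is not the algorithm Nonlin uses.

```python
import math

# Hypothetical observations that roughly follow y = exp(b*x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.6, 2.9, 4.3, 7.6, 12.0]

def ssr(b):
    """Sum of squared residuals of y = exp(b*x) in the original units."""
    return sum((y - math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

# Linearized fit: regress ln(y) on x through the origin (closed form).
b_linearized = (sum(x * math.log(y) for x, y in zip(xs, ys))
                / sum(x * x for x in xs))

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if f(left) < f(right):
            hi = right
        else:
            lo = left
    return (lo + hi) / 2.0

# Direct fit: minimize the squared residuals of the original function.
b_direct = golden_section_min(ssr, 0.4, 0.6)

print(f"linearized: b = {b_linearized:.5f}, SSR = {ssr(b_linearized):.5f}")
print(f"direct:     b = {b_direct:.5f}, SSR = {ssr(b_direct):.5f}")
```

The two estimates of b differ, and the direct fit yields the smaller
sum of squared residuals when measured against the original function.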
1.3 Installing Nonlin
The Nonlin system consists of the following files:
NONLIN.EXE -- The executable program.
NONLIN.DOC -- Documentation file.
NONLIN.FON -- Font file used if you request a plot.
NONLIN.LJF -- LaserJet font file (registered version only).
*.NLR -- Example command files.
REGISTER.DOC -- Form used to register your use of Nonlin.
To install Nonlin, copy the files into the directory of your
choice. The registered version of Nonlin includes a file named
NONLIN.LJF with the fonts needed for printing plots on HP LaserJet
printers. If you do not plan to generate hard copy output for a
LaserJet printer, you may delete the NONLIN.LJF file. If the
NONLIN.FON and NONLIN.LJF files are not in your current directory,
you must place a command of the following form in your
AUTOEXEC.BAT file to tell Nonlin where to look for its font files:
SET NONLIN=directory
Where "directory" is the name of the device and directory where
the files are located. For example, if the files are located in a
directory named NONLIN on the C disk, the following command could
be used:
SET NONLIN=C:\NONLIN
Chapter 2
Using Nonlin
Once Nonlin has been installed, it can be started using a DOS
command of the form:
NONLIN command_file [listing_file]
where "command_file" is the name of a file containing Nonlin
commands that control the analysis. The sections that follow
describe these commands. If you specify a command file name
without an extension, ".NLR" is used as the default extension.
A "listing_file" parameter may be specified on the command line.
If you specify a file name, the output (results) of the regression
analysis is written to this file. If no file name is specified,
the output is written to a file with the same name as the
command_file but with the extension ".LST". If you specify a
listing file name without an extension, ".LST" is provided as the
default extension. Specify NUL for the listing_file if you do not
want to generate an output file.
For example, to process a command file named LINEAR.NLR, directing
output to a file named LINEAR.LST, use the following command:
NONLIN LINEAR
To do the same analysis, directing the output to a file named
MODEL1.LST, use the following command:
NONLIN LINEAR MODEL1
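The default file name rules can be summarized in a short Python
sketch (for illustration only; it is not part of Nonlin). The
function name resolve_files is hypothetical, and leaving the name NUL
unchanged is an assumption about how the DOS null device is handled.

```python
import os

def resolve_files(command_file, listing_file=None):
    """Apply the default-extension rules described above (a sketch)."""
    root, ext = os.path.splitext(command_file)
    if not ext:
        command_file = root + ".NLR"
    if listing_file is None:
        # No listing file given: reuse the command file name with .LST.
        listing_file = root + ".LST"
    elif listing_file.upper() != "NUL" and not os.path.splitext(listing_file)[1]:
        # Listing file given without an extension: supply .LST.
        listing_file += ".LST"
    return command_file, listing_file

print(resolve_files("LINEAR"))
print(resolve_files("LINEAR", "MODEL1"))
print(resolve_files("LINEAR", "NUL"))
```

The first two calls correspond to the two example commands above.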
Normally Nonlin commands and computed results are displayed on the
screen and written to the listing file; however, if you place a
NOECHO command in your command file the screen display is
suppressed but the output is still written to the listing file.
At this point, I suggest you pause in your reading and try running
a Nonlin example to get a feel for how it works. Several example
files with the extension ".NLR" are provided with the
distribution. LINEAR.NLR is a good one to start with followed by
AIDS.NLR. If you do not have a graphics monitor, edit the
LINEAR.NLR command file (and other example files) and remove the
PLOT command.
2.1 Statement Syntax
The syntax for Nonlin statements follows the style of the C
programming language. Each statement must end with a semicolon
character. You may place more than one statement on a line, and
text strings and file names are enclosed in quote marks. Nonlin
has the same arithmetic and logical operators as C, and brace
characters ('{' and '}') are used to group statements.
The following keywords are reserved by Nonlin and may not be used
as the names of variables or parameters: angletype, break,
confidence, connect, connect2, constant, constrain, continue,
correlate, covariance, data, do, domain, double, else, for,
function, grid, if, iteration, iterations, model, noecho, nogrid,
nopause, notitle, noxlabel, noylabel, nplot, output, parameter,
parameters, plot, poutput, presolution, print, register, residual,
rplot, splot, stop, sweep, title, to, tolerance, values, var,
variable, variables, while, width, xlabel, xvar, xvar2, ylabel,
yvar, yvar2.
2.2 Variables and Parameters
Nonlin allows you to use four types of variables: input, computed,
system, and constant.
Input variables are variables whose values are set by observations
read from your data file. Input variables are declared using
the VARIABLES statement. For each step of the analysis Nonlin
cycles through each data observation and executes the
statements in your Nonlin command file. Each execution begins
by setting the input variables to the values for a specific
observation. If you want to transform input variables you
should assign the transformed values to computed variables
rather than modifying the values of input variables.
Computed variables are variables whose values are computed and
assigned in the Nonlin command file. They are declared using
the DOUBLE statement. You can use computed variables to hold
intermediate results of calculations or to hold transformed
values of input variables. Computed variables may hold single
values (scalars), lists of items (vectors), or arrays of
values. You may assign initial values to computed variables
when they are declared with the DOUBLE statement and you may
modify the values during the course of execution.
System variables hold values calculated by Nonlin during the
execution of your commands. There are three system variables:
PREDICTED --- the value of the dependent variable computed by
executing your function using the current parameter and
input variable values.
RESIDUAL --- the difference between predicted and actual value
of the dependent variable of the function.
OBS --- the number of the observation record from your data
file that is currently being executed, the first record is
number 1.
The PREDICTED and RESIDUAL variables only have defined values
after the FUNCTION statement has been executed. They are
primarily useful in SPLOT and OUTPUT statements.
Symbolic constants are declared using the CONSTANT statement and
assigned values with their declaration. The values may not be
altered after the declaration. Symbolic constants may not
appear on the left side of an equal sign; otherwise they may
be used wherever variables or constants may be used.
There are two other designations for variables that should be
mentioned. An "independent" variable is a variable that appears
in the FUNCTION statement on the right side of the equal sign.
You may have many independent variables. The "dependent" variable
appears in the FUNCTION statement on the left side of the equal
sign. There will be only one dependent variable. If you use
multiple FUNCTION statements, the same dependent variable must be
used in each one. Input and computed variables may be used as
independent and dependent variables.
In addition to variables you will use the PARAMETERS statement to
declare parameters whose values are to be calculated by Nonlin.
Parameters are used like variables but their values are neither
read from the input file nor computed by statements in your
program; rather, their values are determined by Nonlin so as to
cause the function to best fit your data observations.
2.3 Plots
Nonlin allows you to generate four types of plots as part of the
analysis. Any combination of these plot types may be requested
and you can generate multiple scatter plots by including more than
one SPLOT statement. With the registered version of Nonlin you
can cause hard copy images of the plots to be generated on HP
LaserJet printers. The plots are requested using the following
statements:
PLOT --- Generate a plot showing the computed value of the
function superimposed on a scatter plot of the input data
values. Nonlin evaluates the function using the computed
values of the parameters at many data points over the domain
and plots it as a smooth curve. Because Nonlin must evaluate
the function at points between the observations the PLOT
statement can only be used if your function has a single
independent variable which is an input variable (not a
computed variable). You may use symbolic constants in the
function.
SPLOT --- Generate a scatter plot with marks at (X,Y) coordinates
of data points. You may plot two sets of points on the same
graph for comparison purposes. Unlike the PLOT command, you
may use any type of variable with the SPLOT command: input,
computed, system, and constant. Options are available to
cause the points to be connected by straight line segments but,
unlike the PLOT statement, Nonlin will not compute curved
segments between the points. You can use multiple SPLOT
statements in your program to cause multiple scatter plots to
be generated.
RPLOT --- Generate a plot showing the residual values of the
function on the vertical (Y) axis. A "residual" value (or
error deviation) is the difference between an actual value of
the dependent variable of the function for an observation and
the predicted value based on the function fitted by the
regression analysis. A residual plot is useful for
determining where, and by how much, the fitted function fails
to predict the actual observations.
NPLOT --- Display a normal probability plot of the residual
values. In this plot, the actual value of each residual is
plotted on the vertical (Y) axis and the expected value of the
residual, assuming the residuals are normally distributed, is
plotted on the horizontal (X) axis. If the residuals are
normally distributed, the resulting plot will be a straight
line passing through the origin with a slope of 1. A normal
probability plot is useful for determining whether the
residuals are normally distributed. If they are not normally
distributed then the form of the function being fitted may be
inappropriate for the data.
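To show what a normal probability plot compares, the following
Python sketch (for illustration only) pairs each residual with the
value expected for its rank if the residuals were normally
distributed. The residual values are hypothetical. The plotting
position (i - 0.5)/n and the scaling by the standard deviation of the
residuals are assumptions made for this sketch; the exact formulas
used by NPLOT are not given here.

```python
from statistics import NormalDist, pstdev

def normal_plot_points(residuals):
    """Pair each sorted residual with its expected value under normality."""
    n = len(residuals)
    scale = pstdev(residuals)
    ordered = sorted(residuals)
    points = []
    for i, actual in enumerate(ordered, start=1):
        # Plotting position (i - 0.5)/n is an assumed convention.
        expected = scale * NormalDist().inv_cdf((i - 0.5) / n)
        points.append((expected, actual))
    return points

residuals = [-1.2, -0.4, -0.1, 0.3, 0.5, 0.9]  # hypothetical residuals
for expected, actual in normal_plot_points(residuals):
    print(f"{expected:8.3f} {actual:8.3f}")
```

Plotting the expected value on the X axis against the actual value on
the Y axis gives points near a straight line when the residuals are
close to normally distributed.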
2.4 Overview of Computation Process
Before getting into the details of how you present an analysis to
Nonlin it is a good idea to review the basic computation process.
You prepare a command file containing Nonlin statements to declare
variables, perform computations, and compute the value of a
function that you wish to have fitted to your data observations.
The function will have a dependent variable on the left side of
the equal sign and one or more independent variables and
parameters on the right side of the equal sign. Either input or
computed variables may be used as dependent and independent
variables.
For each observation record in your data file Nonlin executes your
statements and computes the value of the function. The computed
value of the function is assigned to the PREDICTED system
variable. The predicted value is then subtracted from the actual
value of the dependent variable and this difference is assigned to
the RESIDUAL system variable. This process is repeated for each
observation in the data file and the squared residual values are
added together. After all of the observations have been processed
Nonlin adjusts the values of the parameters whose values are to be
determined and repeats the process attempting to minimize the sum
of the squared residuals.
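The cycle described above can be sketched in Python (for
illustration only). The function sum_squared_residuals makes one pass
over the observations, and fit adjusts the parameters and repeats the
pass. The data values are hypothetical. The adjustment rule shown, a
simple search that tries a step in each direction and halves the step
when nothing improves, is a stand-in chosen for brevity; it is not
the minimization algorithm Nonlin uses.

```python
# Hypothetical observations of (x, y) for the model y = a + b*x.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]

def sum_squared_residuals(a, b):
    """One pass over the data: predict, take the residual, square, add."""
    total = 0.0
    for x, y in data:
        predicted = a + b * x
        residual = y - predicted
        total += residual * residual
    return total

def fit(a=0.0, b=0.0, step=1.0, min_step=1e-7):
    """Adjust the parameters repeatedly until no small change helps."""
    best = sum_squared_residuals(a, b)
    while step > min_step:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            trial = sum_squared_residuals(a + da, b + db)
            if trial < best:
                a, b, best = a + da, b + db, trial
                improved = True
        if not improved:
            step /= 2.0
    return a, b, best

a, b, ssr = fit()
print(f"a = {a:.4f}, b = {b:.4f}, sum of squared residuals = {ssr:.4f}")
```

For this straight-line model the result agrees with the ordinary
linear regression formulas (a = 1.15, b = 1.94 for these data), which
is a useful check on the sketch.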
It is important to understand what happens when Nonlin executes a
FUNCTION statement such as
Function y = a + b*x;
Unlike an assignment statement, this FUNCTION statement does not
assign a new value to the y variable. Rather it computes the
value of the expression on the right side of the equal sign using
the current values of the 'a' and 'b' parameters and assigns this
value to the PREDICTED system variable. It then subtracts this
from the current value of the y variable to determine the
residual.
2.5 Function Specification
Much of the power of Nonlin comes from its ability to estimate the
value of parameters that are part of complicated functions that
you enter in ordinary algebraic form.
Nonlin provides you with great power in defining the function.
Not only can you use a complicated expression to specify the
function, you can also use multiple statements complete with
intermediate work variables, conditional control (IF, ELSE), and
looping. The only requirement is that a FUNCTION statement must
be executed to define the estimated value of the dependent
variable.
The following section explains the arithmetic operators and
built-in functions that are used to specify a function.
2.5.1 Arithmetic Operators
The following arithmetic operators may be used in expressions:
++ add 1 to a variable
-- subtract 1 from a variable
+ addition
- subtraction or unary minus
* multiplication
/ division
% modulo
** or ^ exponentiation
The "++" and "--" operators may be used either immediately before
or after a variable name. If they are used before the name, the
increment or decrement is performed before the value of the
variable is used in the expression. If they are used after the
name, the value of the variable before being modified is used in
the expression and then the increment or decrement takes place.
For example, the sequence:
a = 3;
b = 3;
x = ++a;
y = b++;
assigns the value 4 to x and 3 to y. At the end of the sequence,
both a and b have the value 4.
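For readers more familiar with languages that lack these operators,
the same sequence can be written out step by step. The Python sketch
below (for illustration only) spells out the order of operations
described above.

```python
a = 3
b = 3

# x = ++a;  increment first, then use the new value
a += 1
x = a

# y = b++;  use the old value, then increment
y = b
b += 1

print(a, b, x, y)
```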
The following assignment operators can be used in expressions:
variable = expression; // Assign expression to variable
variable += expression; // Add expression to variable
variable -= expression; // Subtract expression from variable
variable *= expression; // Multiply variable by expression
variable /= expression; // Divide variable by expression
The following operators compare two values and produce a value of
1 if the comparison is true, or 0 if the comparison is false:
== Equal
!= Not equal
<= Less than or equal
>= Greater than or equal
< Less than
> Greater than
The following logical operators may be used:
! Logical NOT (negates true and false)
&& AND
|| OR
The conditional operator has the form:
operand1 ? operand2 : operand3
The value of operand1 is evaluated. If it is true (not zero) then
the value of operand2 is the result of the expression. If the
value of operand1 is false (zero) then operand3 is the result of
the expression.
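The comparison and conditional operators combine naturally to build
piecewise expressions. The Python sketch below (for illustration
only) mimics their described behavior; the helper names compare_le,
conditional, and piecewise, and the breakpoint of 10, are
hypothetical. Note that this sketch always evaluates both operand2
and operand3 before choosing one, which may differ from how Nonlin
evaluates the conditional operator.

```python
def compare_le(a, b):
    """Comparison in the C style: 1 if a <= b is true, 0 if false."""
    return 1 if a <= b else 0

def conditional(operand1, operand2, operand3):
    """operand1 ? operand2 : operand3 (any non-zero operand1 is true)."""
    return operand2 if operand1 != 0 else operand3

def piecewise(x):
    """2*x up to a hypothetical breakpoint of 10, then a constant 20."""
    return conditional(compare_le(x, 10.0), 2.0 * x, 20.0)

print(piecewise(3.0), piecewise(15.0))
```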
There are two other special operators: "[...]" (square brackets)
which enclose subscripts on arrays (see the description of the
DOUBLE statement for information about arrays), and "," (comma)
which is used to specify left-to-right, sequential evaluation of a
list of expressions.
Operator precedence, in decreasing order, is as follows:
subscript, unary minus, logical NOT, ++ and --, exponentiation,
multiplication, division and modulo, addition and subtraction,
relational (comparison), logical AND, logical OR, conditional,
assignment, comma. Parentheses may be used to group terms.
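One consequence of this ordering deserves care. The list places
unary minus above exponentiation, so, if the list is read literally,
an expression such as -x^2 would be grouped as (-x)^2. Many other
languages group it as -(x^2). The Python sketch below (for
illustration only) shows the two groupings; the behavior of Nonlin
itself has not been verified here.

```python
x = 3.0

# Python applies exponentiation before unary minus.
python_result = -x ** 2

# Grouping that matches the precedence list above, where unary minus
# ranks higher than exponentiation.
listed_order_result = (-x) ** 2

print(python_result, listed_order_result)
```

When the sign of a squared term matters, writing the parentheses
explicitly removes any doubt.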
2.5.2 Numeric Constants
Numeric constants may be written in their natural form (1, 0, 1.5,
.0003, etc.) or in exponential form, n.nnnEppp, where n.nnn is the
base value and ppp is the power of ten by which the base is
multiplied. For example, the number 1.5E4 is equivalent to 15000.
All numbers are treated as "floating point" values (actually,
double precision), regardless of whether a decimal point is
specified or not.
2.5.3 Symbolic Constants
You can use the CONSTANT statement to associate symbolic names
with constant numeric values. When you use the symbolic name in
the function the numeric value is substituted for the symbolic
name. See Section 3.5, page 20. The PIECE.NLR example contains a
symbolic constant.
2.5.4 Built-in Constant
The symbolic name "PI" is equivalent to the value of pi,
3.14159... You may write PI using either upper or lower case.
2.5.5 Built-in Functions
The following functions are built into Nonlin and may be used in
expressions. The ANGLETYPE statement controls whether
trigonometric functions operate on angles in units of degrees or
radians (see Section 3.14, page 23).
ABS(x) -- Absolute value of x.
ACOS(x) -- Arc cosine of x.
ASIN(x) -- Arc sine of x.
ATAN(x) -- Arc tangent of x.
BETAI(x,a,b) -- Incomplete beta function: Ix(a,b). The incomplete
beta function can be used to compute a variety of statistical
functions. For example, the probability of Student's t with
'df' degrees of freedom can be computed with
BETAI(df/(df+t^2),.5*df,.5). The probability of the F
statistic with df1 and df2 degrees of freedom can be computed
with 2*BETAI(df2/(df2+df1*f),.5*df2,.5*df1).
CEIL(x) -- Ceiling of x (an equivalent name for this function is
INT). Returns the smallest integer that is at least as large
as x. For example, CEIL(1.5)=2; CEIL(4)=4; CEIL(-2.6)=-2.
COS(x) -- Cosine of x.
COSH(x) -- Hyperbolic cosine of x.
COT(x) -- Cotangent of x. (COT(x) = 1/TAN(x)).
CSC(x) -- Cosecant of x. (CSC(x) = 1/SIN(x)).
CTOP(angle) -- Convert an angle in the compass coordinate system
to a polar coordinate angle. The polar coordinate system has
the origin of an angle along the positive X axis and the angle
increases in a counter-clockwise direction. The compass
coordinate system has the positive Y axis as the origin (i.e.,
north) and the angle increases in a clockwise direction. The
PTOC function performs the reverse transformation.
DEG(x) -- Converts an angle, x, measured in radians to the
equivalent number of degrees. See also the description of the
ANGLETYPE statement.
EI1(alpha,phi) -- Elliptic integral of the first kind. Computes
the integral from 0 to phi (degrees or radians) of the
function d.phi/sqrt(1-k**2*sin(phi)**2), where k = sin(alpha).
alpha and phi must be in the range 0 to 90 degrees or pi/2
radians. The ANGLETYPE statement determines whether the
angles are in units of degrees or radians.
EI2(alpha,phi) -- Elliptic integral of the second kind. Computes
the integral from 0 to phi degrees or radians of the function
sqrt(1-k**2*sin(phi)**2)*d.phi, where k = sin(alpha). alpha
and phi must be in the range 0 to 90 degrees or pi/2 radians
depending on the setting of ANGLETYPE.
EIC1(alpha) -- Complete elliptic integral of the first kind.
Computes the integral from 0 to 90 degrees (or pi/2 radians)
of the function d.phi/sqrt(1-k**2*sin(phi)**2), where k =
sin(alpha). alpha must be in the range 0 to (less than) 90
degrees (or pi/2 radians) depending on the setting of
ANGLETYPE.
EIC2(alpha) -- Complete elliptic integral of the second kind.
Computes the integral from 0 to 90 degrees (or pi/2 radians)
of the function sqrt(1-k**2*sin(phi)**2)*d.phi, where k =
sin(alpha). alpha must be in the range 0 to 90 degrees or
pi/2 radians depending on the setting of ANGLETYPE.
ERF(x) -- Standard error function of x.
EXP(x) -- e (base of natural logarithms) raised to the x power.
FAC(x) -- x factorial (x!). Note, the FAC function is computed
using the GAMMA function (FAC(x)=GAMMA(x+1)) so non-integer
argument values may be computed.
FLOOR(x) -- Floor of x. Returns the largest integer that is less
than or equal to x. For example, FLOOR(2.5)=2; FLOOR(4)=4;
FLOOR(-3.6)=-4.
GAMMA(x) -- Gamma function. Note, GAMMA(x+1) = x! (x factorial).
GAMMAI(x) -- Reciprocal of GAMMA function (GAMMAI(x) =
1/GAMMA(x)).
GAMMALN(x) -- Log (base e) of the GAMMA function.
HAV(x) -- Haversine of x. (HAV(x) = (1-COS(x))/2).
INT(x) -- Ceiling of x (an equivalent name for this function is
CEIL). Returns the smallest integer that is at least as large
as x. For example, INT(1.5)=2; INT(4)=4; INT(-2.6)=-2.
J0(x) -- Bessel function of the first kind, order zero.
J1(x) -- Bessel function of the first kind, order one.
JN(n,x) -- Bessel function of the first kind, order n.
LOG(x) -- Natural logarithm of x.
LOG10(x) -- Base 10 logarithm of x.
LOG2(x) -- Base 2 logarithm of x.
MAX(x1,x2) -- Maximum value of x1 or x2.
MIN(x1,x2) -- Minimum value of x1 or x2.
NORMAL(x) -- Normal probability distribution of x. x is in units
of standard deviations from the mean. See also the NPD
function. NORMAL(x) = NPD(x,0,1).
NPD(x,mean,std) -- Normal probability distribution of x with
specified mean and standard deviation. X is in units of
standard deviations from the mean.
PAREA(x) -- Area under the normal probability distribution curve
from -infinity to x. (i.e., integral from -infinity to x of
NORMAL(x)).
PRINTF("format",value1,value2,...) -- Format and print a series of
values. The Nonlin printf function has the same syntax and
function as the printf function in the C language. It causes
a string to be written to your terminal and also the listing
file for the analysis. Printf is primarily useful as a
diagnostic tool to give you a way to observe what is happening
during an analysis. Note: since your commands are executed
for each data observation and each iteration, the printf may
generate a great deal of output.
The first argument to printf is a quoted string that contains
characters to be printed, control codes, and (if values are to
be printed) formatting specifications. If you are familiar
with the C programming language, the Nonlin formatting string
has the same form and control codes.
Ordinary characters and numbers in the format string are
printed just as they appear. Use the control code '\n' to
cause a carriage-return, line-feed sequence to be printed to
terminate a line. For example, the following command prints a
line of text:
printf("Beginning of analysis\n");
If you wish to insert formatted values in the string, specify
one or more expressions after the format string. Place in the
format string at the location where you want to insert the
formatted value the sequence '%lf' (percent sign, lower case
L, f) if you want the number formatted in the style nnnn.nnnn
or '%lE' if you want exponential notation (nnn.nnnEnnn).
Optionally, you may specify the width of the formatted value
and the number of decimal places between '%' and 'l'. For
example, the following sequence produces a formatted value
with 8 total characters and 4 decimal places: %8.4lf. Here
are several examples:
printf("Processing observation %lf\n",obs);
printf("X = %lf, Y = %lf\n",x,y);
printf("Predicted = %14.6lE\n",predicted);
PTOC(angle) -- Convert an angle in the polar coordinate system to
a compass coordinate angle. The polar coordinate system has
the origin of an angle along the positive X axis and the angle
increases in a counter-clockwise direction. The compass
coordinate system has the positive Y axis as the origin (i.e.,
north) and the angle increases in a clockwise direction. The
CTOP function performs the reverse transformation.
PTORX(angle,distance) -- Convert a position in polar coordinates
to the corresponding rectangular coordinate. This function
returns the X coordinate of the position; use PTORY to obtain
the Y coordinate. Note: polar coordinates are specified with
the positive X axis being the origin for the angle and with
the angle increasing in the counter-clockwise direction.
PTORY(angle,distance) -- Convert a position in polar coordinates
to the corresponding rectangular coordinate. This function
returns the Y coordinate of the position; use PTORX to obtain
the X coordinate. Note: polar coordinates are specified with
the positive X axis being the origin for the angle and with
the angle increasing in the counter-clockwise direction.
PULSE(a,x,b) -- Pulse function. If the value of x is less than a
or greater than b, the value of the function is 0. If x is
greater than or equal to a and less than or equal to b, the
value of the function is 1. In other words, it is 1 on the
closed interval [a,b] and zero elsewhere. If you need a
function that is zero on [a,b] and 1 elsewhere, use the
expression (1-PULSE(a,x,b)).
RAD(x) -- Converts an angle measured in degrees to the equivalent
number of radians. See also the description of the ANGLETYPE
statement.
RANDOM() -- Returns a random value uniformly distributed in the
range 0 to 1.
ROUND(x) -- Rounds x to the nearest integer. For example,
ROUND(1.1)=1; ROUND(1.8)=2; ROUND(-2.8)=-3.
RTOPA(x,y) -- Convert a rectangular coordinate (x,y) to the
corresponding polar coordinate (angle,distance). This
function returns the angle; use RTOPD to get the distance
coordinate. Note: polar coordinates are specified with the
positive X axis being the origin for the angle and with the
angle increasing in the counter-clockwise direction.
RTOPD(x,y) -- Convert a rectangular coordinate to the
corresponding polar coordinate. This function returns the
distance from the origin; use RTOPA to get the angle. Note:
polar coordinates are specified with the positive X axis being
the origin for the angle and with the angle increasing in the
counter-clockwise direction.
SEC(x) -- Secant of x. (SEC(x) = 1/COS(x)).
SEL(a1,a2,v1,v2) -- If a1 is less than a2 then the value of the
function is v1. If a1 is greater than or equal to a2, then
the value of the function is v2.
SIN(x) -- Sine of x. See TREND.NLR for an example of a function
with a sin term. See also Section 4.9 for additional
information about using sin terms in functions.
SINH(x) -- Hyperbolic sine of x.
SQRT(x) -- Square root of x.
STEP(a,x) -- Step function. If x is less than a, the value of the
function is 0. If x is greater than or equal to a, the value
of the function is 1. If you need a function which is 1 up to
a certain value and then 0 beyond that value, use the
expression STEP(x,a). See PIECE.NLR for an example of this
function.
T(n,x) -- Chebyshev polynomial of order n.
TAN(x) -- Tangent of x.
TANH(x) -- Hyperbolic tangent of x.
Y0(x) -- Bessel function of the second kind, order zero.
Y1(x) -- Bessel function of the second kind, order one.
YN(n,x) -- Bessel function of the second kind, order n.
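The selection and coordinate functions described above can be
mirrored outside Nonlin to check a model by hand. The following
is a minimal Python sketch, not Nonlin code. It assumes angles in
degrees, and it assumes that converted angles are normalized into
the range 0 to 360 (for CTOP and PTOC) or -180 to 180 (for
RTOPA); the ranges Nonlin itself returns are not stated in this
manual. The lower-case function names are hypothetical.

```python
import math

def pulse(a, x, b):
    """PULSE(a,x,b): 1 when a <= x <= b, otherwise 0."""
    return 1.0 if a <= x <= b else 0.0

def step(a, x):
    """STEP(a,x): 0 when x < a, 1 when x >= a."""
    return 1.0 if x >= a else 0.0

def sel(a1, a2, v1, v2):
    """SEL(a1,a2,v1,v2): v1 when a1 < a2, otherwise v2."""
    return v1 if a1 < a2 else v2

def ctop(angle):
    """Compass angle (clockwise from north) to polar angle, in degrees.
    Wrapping the result into [0, 360) is an assumption."""
    return (90.0 - angle) % 360.0

def ptoc(angle):
    """Polar angle (counter-clockwise from +X) to compass angle, in degrees.
    The transformation is its own inverse."""
    return (90.0 - angle) % 360.0

def ptorx(angle, distance):
    """X coordinate of the polar position (angle in degrees, distance)."""
    return distance * math.cos(math.radians(angle))

def ptory(angle, distance):
    """Y coordinate of the polar position (angle in degrees, distance)."""
    return distance * math.sin(math.radians(angle))

def rtopa(x, y):
    """Polar angle of (x,y) in degrees; the (-180, 180] range is an assumption."""
    return math.degrees(math.atan2(y, x))

def rtopd(x, y):
    """Distance of (x,y) from the origin."""
    return math.hypot(x, y)

# A compass heading of 0 (north) lies along the positive Y axis.
north_polar = ctop(0.0)
# 1 - PULSE gives a function that is 0 inside the interval and 1 outside.
outside = 1.0 - pulse(1.0, 2.0, 3.0)
```

Swapping the arguments of step, as in step(x, a), gives a function
that is 1 up to a and 0 beyond it, matching the STEP(x,a) idiom
described above.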
2.6 Nonlin Command Files
The commands described in this section are placed in a command
file. When you start Nonlin, you specify the name of the command
file as a parameter on the command line. For example, if the
command file name is CAR.NLR, the following command would cause
Nonlin to execute the commands in the command file:
NONLIN CAR.NLR
If you do not specify a file name extension for the command file,
".NLR" is used by default. The output of the regression for this
example would be written to a file named CAR.LST. Command files
can be created using a text editor such as EDIT-32, EDLIN, the DOS
EDIT program, or any other editor or word processor that is
capable of creating an ASCII text file without formatting codes.
2.7 Comments
The beginning of a comment is denoted with "//" (two consecutive
slash characters). Everything from the "//" sequence to the end
of the line is treated as a comment. Comments may be on lines by
themselves or on the ends of other statements. You can also
specify a comment by beginning the comment with the "/*" character
sequence. All characters following this are treated as comments
up to the matching "*/" sequence. The following lines illustrate
both types of comments:
// Function to be fitted
y = a + b*x; // Simple linear equation
/*
* This is a comment.
*/
z = y / 5; /* This is a comment too */
2.8 Include Files
Nonlin provides a #INCLUDE statement that you may place in your
command file to cause another file to be inserted in the command
file at the point where the #INCLUDE statement occurs. The
included file may contain any valid Nonlin statements or data that
would be appropriate at the specified point in the command file.
Processing of the statements in the original command file resumes
when the end of the included file (and any nested files) is
reached. The form of the statement is:
#include "file"
where 'file' is the name of the file whose contents are to be
inserted. If no extension is specified, ".NLR" is used by
default. Include files may be nested up to 10 levels deep. The
following is an example of a Nonlin command file that includes the
function specification:
Title "Example of file inclusion";
Variables X,Y;
Parameters a,b;
#include "fun1"; // Function statement is in "fun1.nlr"
data;
(data records follow)
2.9 Required Statements
Every command file must contain the following statements:
VARIABLES, PARAMETERS, FUNCTION, and DATA. The DATA statement
introduces the data for the analysis and must be the last
statement in the file (data records may follow it). Other,
optional, statements may be interspersed in the command file. The
following is an example of a complete command file:
title "Depreciation Example";
variables value,age,miles;
parameters base,depage,depmiles;
function value = base + depage*age + depmiles*miles;
data;
(data records follow)
Chapter 3
Nonlin Statements
The following is a list of the valid Nonlin statements that can be
placed in a Nonlin command file. Nonlin statements are not case
sensitive. Remember to end each statement with a semicolon.
3.1 TITLE
TITLE "string"; (optional) -- Specifies a title line that is
printed with the results of the analysis. Note: the title string
must be enclosed in quotation marks.
3.2 VARIABLES
VARIABLES var1,var2,...; (required) -- Specifies the names of the
input variables whose values will be read from your data file.
The order of the variable names must match the order of the data
values on each observation record. You may define more variables
than you actually use in the function specification. A maximum of
25 variables may be specified. The length of a variable name is
limited to 10 characters. Capitalize the variable names as you
want them displayed in the results. The keyword "VARIABLE" may be
used instead of "VARIABLES".
You may specify all of the variables on a single statement or you
may use multiple VARIABLES statements. If you use multiple
statements, the order in which they appear in the command file
must match the order of the variable values on each observation
record. The VARIABLES statement must precede the FUNCTION
statement. See F33.NLR for an example of a multiple regression
using three independent variables.
You can also use the DOUBLE statement to declare variables (see
below). The difference is that the VARIABLES statement declares
variables that are read from the input file whereas the DOUBLE
statement declares variables whose values will be computed by
statements in your command file.
3.3 PARAMETERS
PARAMETERS param1[=initial1],param2[=initial2],...; (required) --
Specifies the names of the parameters whose values are to be
determined by Nonlin. Nonlin is capable of handling up to 25
parameters. The parameter names may not exceed 10 characters in
length. Do not specify any parameters that are not used in the
analysis. The PARAMETERS statement must precede the FUNCTION
statement. The keyword "PARAMETER" may be used instead of
"PARAMETERS".
Optionally, an initial estimate of the parameter value may be
specified by following the parameter name with an equal sign and
the value. If no value is specified, 1 is used by default.
Specifying an initial value that is near the actual value usually
speeds up the operation of Nonlin and may enable it to
successfully converge to a solution. If Nonlin is unable to
converge to a solution, try specifying different starting values
for the parameters. Try to specify a value that at least has the
same sign as the expected final value.
The CONSTRAIN statement (see page 20) can be used to limit the
range of values for parameters. The SWEEP statement (see page 21)
can be used to perform the regression analysis with a range of
parameter initial values.
3.4 DOUBLE
DOUBLE var1[=value],var2[=value],...; (optional) -- Specifies the
names of computed variables that you may use subsequently to hold
calculated values. Nonlin allows you to define up to 30 computed
variables. All variables hold double precision (64 bit) floating
point values. Optionally, the name of a variable may be followed
by an equal sign and a value to which the variable is initialized.
If you do not specify an initial value, the variable is
initialized to 0. The following are examples of DOUBLE
statements:
double t1,t2;
double roomtemp=73;
It is convenient to use computed variables for intermediate
calculations such as transformed values of input variables.
Nonlin allows you to declare arrays with one or two dimensions.
To do this, follow the name of the variable with the number of
array elements enclosed in square brackets. If the array has two
dimensions, specify the number of rows, then the number of columns,
separated by a comma. (Note: this is different from the C
language syntax for declaring a two-dimensional array.) The
following statements declare a one dimensional array (i.e., a
vector) with 20 elements and a two dimensional array with 5 rows
and 10 columns:
double xvec[20];
double ya[5,10];
You may assign initial values to arrays by following the variable
declaration with an equal sign and a list of values enclosed in
curly braces. In the case of a two dimensional array, the values
should be specified by rows (i.e., the last subscript varies most
rapidly). The following are examples of array declarations with
initializations:
double xvec[5] = {2,5,7,1,0};
double xa[2,3] = {2.3,7.5,1.2,4.4,2.6,7.3};
When used in expressions the subscript values are 0 based. That
is, the first element of the array is referenced using a subscript
value of 0 and the last element is referenced using a subscript
value equal to one less than the number of elements in the array.
For example, the following statements would declare an array with
100 elements and initialize it:
double xsq[100],i;
for (i=0; i<100; i++) {
xsq[i] = i;
}
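The row-by-row initialization order and the 0-based subscripts
described above can be illustrated with a short Python sketch.
This is not Nonlin code; the helper name make_array is
hypothetical. It shows how the flat initializer list for
xa[2,3] maps onto rows and columns.

```python
def make_array(rows, cols, values):
    """Build a rows-by-cols table from a flat list given by rows,
    so the last (column) subscript varies most rapidly."""
    if len(values) != rows * cols:
        raise ValueError("initializer count must equal rows * cols")
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]

# Same values as: double xa[2,3] = {2.3,7.5,1.2,4.4,2.6,7.3};
xa = make_array(2, 3, [2.3, 7.5, 1.2, 4.4, 2.6, 7.3])

first_row = xa[0]        # subscripts start at 0
last_element = xa[1][2]  # last row, last column (rows - 1, cols - 1)
```

Here the element written xa[1,0] in Nonlin notation is the fourth
value in the initializer list, 4.4.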
3.5 CONSTANT
CONSTANT variable=value; (optional) -- Specifies the name of a
symbolic constant and associates a numeric value. You can then
use the symbolic name where you would use a number and the
corresponding constant numeric value will be substituted. This is
useful when you are trying out different models and want to easily
be able to change a constant value for each run. The following is
an example of a symbolic constant named "Roomtemp" that causes the
value 73 to be substituted in the function:
Variable Time; // Cooling time in seconds
Variable Temp; // Temperature of object
Constant Roomtemp = 73; // Ambient temperature
Parameter InitTemp; // Initial temperature
Parameter Coolrate; // Cooling rate factor
Function Temp = Roomtemp + InitTemp * exp(-Coolrate * Time);
3.6 CONSTRAIN
CONSTRAIN parameter=lowvalue,highvalue; (optional) -- Specifies a
lower and upper limit on the range of a parameter value. During
the solution process, Nonlin may allow a parameter's value to
temporarily move in a direction away from its final value. With
some functions it may be necessary to constrain the parameter's
value so that it does not go negative (e.g., if the function takes
the square root of the parameter), or zero (if the parameter is in
a denominator). If a parameter is tightly constrained, Nonlin may
report "singular convergence" because it is unable to converge to
an optimum value of the parameter; however, the estimated values
of other parameters may be useful.
Only a single parameter and its associated limits may be specified
on each CONSTRAIN statement, but you may use multiple CONSTRAIN
statements. The PARAMETERS statement must precede the CONSTRAIN
statement. Use the CONSTANT statement if you wish to define a
parameter with a fixed value.
The parameter value is allowed to range from 'lowvalue' to
'highvalue'. If you want to prevent a parameter value from going
to zero, you must specify a value greater than zero for the low
value (specifying zero would allow it to reach, but not go below,
zero). For example, the following statement constrains the value
of 'age' to be greater than zero and less than or equal to 100:
constrain age = .0001,100;
See the COOLING.NLR, F33EXP.NLR, and POWER.NLR files for examples
of the CONSTRAIN statement.
3.7 SWEEP
SWEEP parameter=lowvalue,highvalue,stepsize; (optional) --
Specifies that the regression analysis is to be performed
repeatedly with a set of starting values for the parameter. The
first analysis is performed with the parameter having the
'lowvalue'; the value of 'stepsize' is then added to the
parameter's initial value and the analysis is performed again.
The process is repeated until the value of the parameter reaches
'highvalue'.
Each time the analysis is performed the value of the residual sum
of squares is compared with the best previous result. The
estimated values of the parameters for the best starting value are
saved and used for the final analysis and report.
Only one parameter may be specified on each SWEEP statement, but
you may have as many SWEEP statements as there are parameters.
The number of regression analyses performed will be equal to the
product of the number of parameter values for each SWEEP
statement.
The SWEEP statement is useful when you are trying to fit a
complicated function that may have "local minimum" values other
than the "global minimum". Periodic functions (sin, cos, etc.)
are especially troublesome.
See the SINE.NLR command file for an example of the SWEEP
statement.
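The idea behind SWEEP can be sketched in Python (3.8 or later).
This is an illustration only, not Nonlin's implementation: the
local search below is a simple toy step-halving routine standing
in for the real regression algorithm, and the names sweep_values,
total_runs, local_fit, and sweep_fit are hypothetical. The toy
model is y = sin(freq * x) with angles in radians, a periodic
function whose sum of squares has several local minima.

```python
import math

def sweep_values(low, high, step):
    """Starting values low, low+step, ... up to and including high."""
    count = int(math.floor((high - low) / step + 1e-9)) + 1
    return [low + i * step for i in range(count)]

def total_runs(sweeps):
    """Number of analyses = product of the value counts of each sweep."""
    return math.prod(len(sweep_values(*s)) for s in sweeps.values())

def rss(freq, data):
    """Residual sum of squares for the toy model y = sin(freq * x)."""
    return sum((y - math.sin(freq * x)) ** 2 for x, y in data)

def local_fit(start, data, h=0.1, min_h=1e-9, max_moves=10_000):
    """Toy local search: move to a neighbor while the residual sum of
    squares improves, otherwise halve the step size."""
    p, best = start, rss(start, data)
    for _ in range(max_moves):  # hard cap so the search always ends
        if h <= min_h:
            break
        value, cand = min((rss(c, data), c) for c in (p - h, p + h))
        if value < best:
            p, best = cand, value
        else:
            h /= 2.0
    return p, best

def sweep_fit(low, high, step, data):
    """Run the local fit from every starting value and keep the result
    with the smallest residual sum of squares."""
    best = None
    for start in sweep_values(low, high, step):
        estimate, value = local_fit(start, data)
        if best is None or value < best[1]:
            best = (estimate, value)
    return best

# Synthetic observations generated with a true frequency of 3.
xs = [0.25 * i for i in range(20)]
data = [(x, math.sin(3.0 * x)) for x in xs]
best_freq, best_rss = sweep_fit(0.5, 5.0, 0.5, data)
```

With two sweeps, for example one with 3 starting values and one
with 4, total_runs reports 12 analyses, which is why sweeping many
parameters at once can become slow.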
3.8 FUNCTION
FUNCTION depvar = function; (required) -- Specifies the form of
the function whose parameters are to be determined. The dependent
variable must be the only thing to the left of the equal sign.
The expression to the right of the equal sign may contain
parameters, input variables, computed variables (declared using
the DOUBLE statement), constants, operators, and library functions
such as sqrt, sin, and exp. The VARIABLES and PARAMETERS
statements must appear in the command file before the FUNCTION
statement. You may use more
than one FUNCTION statement if you use IF or other conditional
statements to select which one will be executed. However, during
each execution of your command file one, and only one, FUNCTION
statement must be executed. Some example FUNCTION statements are
shown below:
Function y = p0 + p1*x;
Function distance = .5 * accel * time^2;
Function value = price + yrdep*age + miledep*miles;
Function populatn = base * growrate * exp(time);
3.9 CORRELATE
CORRELATE [var1,var2,...]; (optional) -- Causes Nonlin to compute
and print a correlation matrix. If you do not specify a list of
variables the correlation matrix includes all input variables. If
you wish to control exactly which variables are included in the
matrix or if you wish to include computed variables (declared with
a DOUBLE statement) you may specify a list of variables. See
Section 4.11 on page 41 for more information about correlation.
The F33.NLR example includes a CORRELATE statement. The following
are examples of the CORRELATE statement:
correlate;
correlate x1,x2,x3,y;
3.10 COVARIANCE
COVARIANCE; (optional) -- Causes the variance-covariance matrix
for the parameters to be printed.
3.11 CONFIDENCE
CONFIDENCE [percent]; (optional) -- Specifies that a confidence
interval is to be printed for each estimated parameter. The
purpose of regression analysis is to determine the best estimate
of parameter values. However, as with most statistical
calculations, the values determined are estimates of the true
values. The CONFIDENCE statement causes Nonlin to print a table
showing the range of possible values for each parameter given a
specified confidence value. The "percent" parameter specifies the
probability that the actual value of the parameter is within
the confidence interval to be computed. For example, the
statement
Confidence 95;
specifies that the confidence interval(s) are to be computed such
that there is a 95 percent probability that the actual values of
the parameters are within the intervals (or that there is a 5
percent chance that the parameters are outside the intervals).
The "percent" parameter may range from 50 to 99.999. If the
CONFIDENCE statement is used without specifying a percent value,
90 is used by default.
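To show how the percent value widens or narrows an interval, the
following Python sketch (3.8 or later) computes a two-sided
interval from an estimate and its standard error. It is only an
approximation under stated assumptions: it uses the normal
distribution, whereas this manual does not describe the exact
method Nonlin uses, and a small-sample calculation would normally
use Student's t distribution instead. The function name is
hypothetical.

```python
from statistics import NormalDist

def confidence_interval(estimate, std_error, percent=90.0):
    """Two-sided interval around an estimate (normal approximation).
    percent follows the CONFIDENCE range of 50 to 99.999."""
    if not 50.0 <= percent <= 99.999:
        raise ValueError("percent must be between 50 and 99.999")
    tail = (1.0 - percent / 100.0) / 2.0       # probability in each tail
    z = NormalDist().inv_cdf(1.0 - tail)       # standard normal quantile
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

low90, high90 = confidence_interval(10.0, 2.0)        # default 90 percent
low95, high95 = confidence_interval(10.0, 2.0, 95.0)
```

A higher percent value always produces a wider interval for the
same estimate and standard error.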
3.12 TOLERANCE
TOLERANCE value; (optional, default=1E-10) -- Specifies the
tolerance factor that is used to determine when the algorithm has
converged to a solution. Reducing the tolerance value may produce
a slightly more accurate result but will increase the number of
iterations and the running time. The tolerance value must be in
the range 1E-15 to 1E-1. See Section 5.2 for additional
information about how the tolerance value is used to determine
when the function has converged.
3.13 ITERATIONS
ITERATIONS value; (optional, default=50) -- Specifies the maximum
number of iterations that should be attempted by the algorithm.
If the solution does not converge to the limit specified by the
TOLERANCE statement (or to the default tolerance) before the
maximum number of iterations is reached, the process is stopped
and the results are printed. Failure to converge before the
specified number of iterations could be caused by one of three
things:
1. The maximum allowed number of iterations may be too small.
Try using an ITERATIONS statement with a larger value.
2. The tolerance factor may be too small. Even a properly
converging solution will eventually "level off" or oscillate
around a good, but non-zero, sum of squares value. Try using
the TOLERANCE statement to increase the tolerance value.
3. The function may not be converging. Try specifying better (or
at least different) starting values for the parameters on the
PARAMETERS statement. Consider using the SWEEP statement to
specify a range of parameter starting values.
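The interaction between the tolerance and the iteration limit can
be seen in a small Python sketch. This is not Nonlin's algorithm:
it is a plain Gauss-Newton iteration for the one-parameter model
y = exp(b*x), and the stopping test (relative change in the
parameter) is an assumption chosen for illustration; Section 5.2
describes the test Nonlin actually applies. The defaults mirror
the TOLERANCE, ITERATIONS, and PARAMETERS defaults given in this
chapter. The function name fit_rate is hypothetical.

```python
import math

def fit_rate(xs, ys, start=1.0, tolerance=1e-10, max_iterations=50):
    """Estimate b in y = exp(b*x) by Gauss-Newton steps.
    Returns (estimate, iterations used, converged flag)."""
    b = start
    for iteration in range(1, max_iterations + 1):
        preds = [math.exp(b * x) for x in xs]
        # Jacobian entry for each observation is x * exp(b*x).
        jr = sum(x * p * (y - p) for x, p, y in zip(xs, preds, ys))
        jj = sum((x * p) ** 2 for x, p in zip(xs, preds))
        delta = jr / jj
        b += delta
        # Assumed convergence test: the step is tiny relative to b.
        if abs(delta) <= tolerance * (abs(b) + tolerance):
            return b, iteration, True
    return b, max_iterations, False

xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * x) for x in xs]   # synthetic data, true b = 0.5

estimate, used, converged = fit_rate(xs, ys)
_, _, converged_short = fit_rate(xs, ys, max_iterations=2)
```

With the default limit the iteration settles on b = 0.5; with a
limit of 2 iterations it stops early and reports that it did not
converge, which corresponds to the first cause listed above.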
3.14 ANGLETYPE
ANGLETYPE DEGREES or RADIANS; (optional) -- Specifies whether
trigonometric library functions such as SIN, COS, TAN, etc. are
to operate in units of degrees or radians. The default setting is
degrees. You may only declare the angle type once in your program
and the declaration must come before any statements that use trig
functions. You can also use the DEG() and RAD() functions to
convert between degrees and radians. The following are example
ANGLETYPE statements:
angletype degrees;
angletype radians;
3.15 OUTPUT
OUTPUT [TO "file"] var1,var2,...; (optional) -- Specifies that
after the analysis is completed, data values are to be written to
a file. One record is written for each data observation in the
input file. If the TO "file" portion of the statement is
specified, the output is written to the specified file. If this
portion of the statement is omitted, the output values are written
to the listing file along with the results of the analysis. If a
file name is specified without an extension, ".OUT" is used by
default.
The list of variable names determines which variables are written
to the file and the order in which the values appear in each
output record. Any variable previously declared with a VARIABLES
or DOUBLE statement may be specified. In addition, the following
system variable names may appear in the output list:
OBS -- The observation record number, starting at 1 and increasing
by 1.
PREDICTED -- The predicted value for the dependent variable for
the observation, given the independent variable values and the
parameters as calculated by the analysis.
RESIDUAL -- The difference between the actual value of the
dependent variable and its predicted value.
Examples of OUTPUT statements are shown below:
output age,miles,value,predicted,residual;
output to "growth.dat" obs,time,populatn,predicted;
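The relationship between the OBS, PREDICTED, and RESIDUAL system
variables can be sketched in Python as follows. This is an
illustration, not Nonlin code. The parameter values and the data
records are made up for the example, and the names
output_records and depreciation are hypothetical.

```python
def output_records(observations, predict):
    """Build one record per observation with obs, predicted, residual.
    Each observation is (inputs, actual value of the dependent variable)."""
    rows = []
    for obs, (inputs, actual) in enumerate(observations, start=1):
        predicted = predict(*inputs)
        rows.append({
            "obs": obs,                       # record number, starting at 1
            "predicted": predicted,
            "residual": actual - predicted,   # actual minus predicted
        })
    return rows

# Hypothetical fitted parameters for the depreciation example.
base, depage, depmiles = 10000.0, -500.0, -0.0625

def depreciation(age, miles):
    return base + depage * age + depmiles * miles

# Made-up records: ((age, miles), observed value).
records = [((2.0, 16000.0), 8250.0), ((4.0, 32000.0), 5900.0)]
rows = output_records(records, depreciation)
```

A positive residual means the observed value is above the fitted
function; a negative residual means it is below.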
3.16 POUTPUT
POUTPUT "file"; (optional) -- The POUTPUT statement specifies that
Nonlin is to write the final estimated values of the parameters to
a file. Each parameter value is written to a separate line of the
file. This statement is useful to create a file of estimated
parameter values to be fed into another analysis program. This
statement can also be used to determine the parameter estimates to
more significant digits than displayed in the printed listing
because the format used by the POUTPUT statement writes the values
with 18 significant digits. The following is an example of a
POUTPUT statement:
poutput "params.dat";
3.17 PLOT
PLOT [options]; (optional) -- Display a plot of the calculated
function and the data observations. Each data point is displayed
with a blue 'X'; the function that Nonlin fits to the data is
superimposed as a yellow curve.
The PLOT statement can only be used if the FUNCTION declaration
meets the following requirements: (1) there must be only a single
independent variable; (2) the independent variable must be an
input variable (i.e., declared with a VARIABLES statement, not a
DOUBLE statement). You may use symbolic constants declared with
the CONSTANT statement. If the function does not meet these
requirements you may produce different types of plots using the
SPLOT, RPLOT and NPLOT statements.
You must have a CGA, EGA, or VGA monitor to use the PLOT
statement, and the NONLIN.FON font file must be in the current
directory or in a directory specified by the NONLIN environment
variable. Press Return to proceed with the analysis after you
finish looking at the plot.
The following options may be specified on the PLOT statement:
NOGRID -- suppress the grid lines that are normally displayed with
the plot.
TITLE="string" -- specify a title to be displayed with the plot.
If no title is specified the title defined by the TITLE
statement is used.
NOTITLE -- suppresses the title for the plot which, by default, is
the title specified with the TITLE statement.
XLABEL="string" -- specify a label to be printed along the X axis.
If you do not use this qualifier, the name of the variable whose
values determine the X coordinates is used as the default
label.
NOXLABEL -- suppress printing any label along the X axis.
YLABEL="string" -- specify a label to be printed along the Y axis.
If you do not use this qualifier, the name of the variable whose
values determine the Y coordinates is used as the default
label.
NOYLABEL -- suppress printing any label along the Y axis.
DOMAIN=lowvalue,hivalue -- specifies the domain over which the
plot is to be generated. If no domain is specified, Nonlin
uses the range of the independent variable for the domain.
RESIDUAL -- draw vertical lines from each observed data point to
the corresponding point on the calculated function line.
These lines represent the "residual" value that Nonlin is
attempting to minimize. See also the descriptions of the
RPLOT and NPLOT statements on pages 28 and 30.
ITERATION -- draw a plot for each iteration of the regression
analysis. Normally, the plot is drawn after the analysis has
converged to a solution; you may use the ITERATION option to
observe the function during each iteration of the analysis as
it converges to fit the data.
VALUES -- use in conjunction with the ITERATION option to cause
the current parameter values to be displayed before the plot
for the current iteration.
PRINT -- print a copy of the plot on an HP LaserJet printer. This
option is only available in the registered version of Nonlin.
Nonlin writes the plot to the PRN device which must be
attached to an HP Series II or Series III printer. The
NONLIN.LJF font file must be in the current directory or in a
directory specified by the NONLIN environment variable.
NOPAUSE -- do not pause after the plot is displayed. Normally,
Nonlin pauses after displaying a plot to allow you time to
examine it; you press Enter to continue execution once you
have finished looking at the plot. The NOPAUSE option causes
Nonlin to continue with execution without pausing after the
plot is displayed. This is useful in conjunction with the
PRINT option when Nonlin is run in a batch file and you want
to generate a hardcopy plot but not pause after the screen
display.
If more than one option is specified, separate them with commas.
For example, to produce a plot with X and Y axis labels use a
statement with the following form:
PLOT XLABEL="Time",YLABEL="Blood concentration";
3.18 SPLOT
SPLOT [options]; (optional) -- Display a scatter plot of (X,Y)
data points. Using the XVAR and YVAR options (see below) you can
specify which variable is used for the vertical (Y) dimension and
which is used for the horizontal (X) dimension. Any type of
variable may be specified including input variables, computed
variables (declared with the DOUBLE statement), the dependent
variable of the function, and the system variables PREDICTED,
RESIDUAL, and OBS (see Section 2.2 on page 6).
You may display two scatter plots on the same image. This is
useful for comparing computed values with input values. To do
this use the XVAR2 and YVAR2 options to specify the variables for
the X and Y dimensions for the second plot. Each data point for
the primary plot (specified by XVAR and YVAR) is marked with a
blue 'X'. The data points for the second plot (specified by XVAR2
and YVAR2) are marked with yellow triangles. You can use the
CONNECT and CONNECT2 options to draw straight line segments
through the points. The NOMARK and NOMARK2 options may be used to
suppress the data point markers.
The following options may be specified on the SPLOT statement:
XVAR=variable -- specify the variable to be used for the
horizontal (X) dimension of the first set of plotted points.
This can be any type of variable, input or computed. If you
do not specify this option and there is only a single
independent variable in the function, it is used by default.
YVAR=variable -- specify the variable to be used for the vertical
(Y) dimension. This can be any type of variable, input or
computed. If you do not specify this option then the
dependent variable of the function (i.e., the one on the left
of the equal sign) is used by default.
XVAR2=variable -- specify the variable to be used for the
horizontal (X) dimension of the second set of plotted points.
This can be any type of variable. If you specify YVAR2 but
not XVAR2, the default is the same variable as specified by
XVAR.
YVAR2=variable -- specify the variable to be used for the vertical
(Y) dimension of the second set of plotted points. This can
be any type of variable.
CONNECT -- Connect the first set of points by straight line
segments. The points are displayed and connected in the same
order that they appear in the data file.
CONNECT2 -- Connect the second set of points by straight line
segments.
NOMARK -- Suppress the display of the 'X' symbols that normally
mark the first set of data points. This can be used with
CONNECT to cause only the line to be drawn.
NOMARK2 -- Suppress the display of the triangle symbols that
normally mark the second set of data points.
NOGRID -- suppress the grid lines that are normally displayed with
the plot.
TITLE="string" -- specify a title to be displayed with the plot.
If no title is specified the title defined by the TITLE
statement is used.
NOTITLE -- suppresses the title for the plot which, by default, is
the title specified with the TITLE statement.
XLABEL="string" -- specify a label to be printed along the X axis.
If you do not use this qualifier, the name of the variable whose
values determine the X coordinates is used as the default
label.
NOXLABEL -- suppress printing any label along the X axis.
YLABEL="string" -- specify a label to be printed along the Y axis.
If you do not use this qualifier, the name of the variable whose
values determine the Y coordinates is used as the default
label.
NOYLABEL -- suppress printing any label along the Y axis.
DOMAIN=lowvalue,hivalue -- specifies the domain over which the
plot is to be generated. If no domain is specified, Nonlin
uses the range of the horizontal variable(s) for the domain.
PRINT -- print a copy of the plot on an HP LaserJet printer. This
option is only available in the registered version of Nonlin.
Nonlin writes the plot to the PRN device which must be
attached to an HP Series II or Series III printer. The
NONLIN.LJF font file must be in the current directory or in a
directory specified by the NONLIN environment variable.
NOPAUSE -- do not pause after the plot is displayed. Normally,
Nonlin pauses after displaying a plot to allow you time to
examine it; you press Enter to continue execution once you
have finished looking at the plot. The NOPAUSE option causes
Nonlin to continue with execution without pausing after the
plot is displayed.
If there is more than one option, separate them with commas. The
following is an example SPLOT statement:
splot xvar=time,yvar=sodium,yvar2=potassium,connect,connect2,
title="Blood concentration over time",
xlabel="Time (hours)",ylabel="Sodium & Potassium";
3.19 RPLOT
RPLOT [options]; (optional) -- Display a plot of the residual
values. A "residual" value (or error deviation) is the difference
between an actual value of the dependent variable for an
observation and the predicted value based on the function fitted
by the regression analysis. If the calculated function exactly
predicted the actual observation values, all of the residual
values would be zero. However, this is usually not the case and
the residual values show where, and by how much, the fitted
function fails to predict the actual observations.
The RPLOT statement causes Nonlin to display a plot showing the
residual values on the vertical (Y) axis. The variable plotted
along the horizontal (X) axis may be specified using the XVAR
option (see below). You may specify any variable including the
dependent variable and computed variables declared with the DOUBLE
statement. If you do not specify a variable and there is a single
independent variable in the function it is used. The X axis label
indicates which variable was used.
A residual plot is very useful for determining if the form of the
function being fitted is appropriate for the data values. If the
residual values are randomly distributed in positive and negative
directions then the form (shape) of the fitted function is
probably appropriate for the data and the deviations are due to
random measurement errors. If, however, the residuals show a
systematic pattern such as a periodic cycle, then the function may
not be appropriate for the data values. See the discussion of the
Durbin-Watson statistic in Section 4.9, page 39, for additional
information about autocorrelated residual values. The PLOT,
RPLOT, SPLOT, and NPLOT statements may be used in the same command
file. Press Return to proceed with the analysis after you have
finished looking at the plot.
The following options may be specified on the RPLOT statement:
XVAR=variable -- specify which variable is to be used for the
horizontal (X) dimension of the plot. You may specify any
variable including independent input variables, the dependent
variable of the function (i.e., the one on the left of the
equal sign), and computed or transformed variables declared
with the DOUBLE statement. If there is only a single
independent variable Nonlin will use it by default. The label
along the X axis indicates which variable was used.
NOGRID -- suppress the grid lines that are normally displayed with
the plot.
TITLE="string" -- specify a title to be displayed with the plot.
If this option is not specified, the default title is "Plot of
residuals".
NOTITLE -- suppresses the title for the plot which, by default, is
"Plot of residuals".
XLABEL="string" -- specify a label to be printed along the X axis.
If you do not use this qualifier, the name of the variable whose
values determine the X coordinates is used as the default
label.
NOXLABEL -- suppress printing any label along the X axis.
YLABEL="string" -- specify a label to be printed along the Y axis.
If you do not use this qualifier, the default label is
"Residual".
NOYLABEL -- suppress printing any label along the Y axis.
DOMAIN=lowvalue,hivalue -- specifies the domain over which the
plot is to be generated. If no domain is specified, Nonlin
uses the range of the X dimension variable.
ITERATION -- draw a plot for each iteration of the regression
analysis. Normally, the plot is drawn after the analysis has
converged to a solution; you may use the ITERATION option to
observe the function during each iteration of the analysis as
it converges to fit the data.
VALUES -- use in conjunction with the ITERATION option to cause
the current parameter values to be displayed before the plot
for the current iteration.
PRINT -- print a copy of the plot on an HP LaserJet printer. This
option is only available in the registered version of Nonlin.
Nonlin writes the plot to the PRN device which must be
attached to an HP Series II or Series III printer. The
NONLIN.LJF font file must be in the current directory or in a
directory specified by the NONLIN environment variable.
NOPAUSE -- do not pause after the plot is displayed. Normally,
Nonlin pauses after displaying a plot to allow you time to
examine it; you press Enter to continue execution once you
have finished looking at the plot. The NOPAUSE option causes
Nonlin to continue with execution without pausing after the
plot is displayed.
If more than one option is specified, separate them with commas.
3.20 NPLOT
NPLOT [options]; (optional) -- Display a normal probability plot of
the residual values. In this plot, the actual value of each
residual is plotted on the vertical (Y) axis and the expected
value of the residual, assuming the residuals are normally
distributed, is plotted on the horizontal (X) axis. If the
residuals are normally distributed, the resulting plot will be a
straight line passing through the origin with a slope of 1 (i.e.,
the actual value of each residual should equal the expected value
from the normal distribution). If the residuals are not normally
distributed, the plot will deviate from a straight line. Nonlin
displays a red line along which the X marks should be displayed if
the residuals are normally distributed.
Nonlin also computes the correlation between the actual
residual values and their expected values and displays the
correlation coefficient in the title line "(r=n.nnn)". If the
residual values are normally distributed, the correlation should
be close to 1.000. A correlation value less than 0.940 suggests
that the residuals are not normally distributed.
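The correlation shown in the title line can be approximated outside
Nonlin. The following Python sketch is illustrative only: the manual
does not state how Nonlin derives the expected values, so the Blom
plotting positions used here, and the function name
normal_plot_correlation, are assumptions.

```python
import math
from statistics import NormalDist

def normal_plot_correlation(residuals):
    """Correlate sorted residuals with expected normal quantiles.

    Assumption: expected values come from Blom plotting positions
    (i - 0.375) / (n + 0.25); Nonlin's own formula may differ.
    """
    actual = sorted(residuals)
    n = len(actual)
    std_normal = NormalDist()
    expected = [std_normal.inv_cdf((i - 0.375) / (n + 0.25))
                for i in range(1, n + 1)]
    mean_a = sum(actual) / n
    mean_e = sum(expected) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, expected))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in expected)
    return cov / math.sqrt(var_a * var_e)

# Made-up residuals: one large outlier among otherwise equal values.
r = normal_plot_correlation([0, 0, 0, 0, 0, 0, 0, 0, 0, 100])
print(f"r={r:.3f}")
```

Residuals dominated by a single outlier, as in this made-up case,
give a correlation well below the 0.940 guideline.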
The PLOT, RPLOT, SPLOT, and NPLOT statements may be used in the
same command file. Press Return to proceed with the analysis
after you have finished looking at the plot.
The following options may be specified on the NPLOT statement:
GRID -- display grid lines to make it easier to estimate values.
TITLE="string" -- specify a title to be displayed with the plot.
If no title is specified the default title is "Normal
probability plot".
NOTITLE -- suppresses the title for the plot.
XLABEL="string" -- specify a label to be printed along the X axis.
If you do not use this qualifier, the default label is "Expected
residuals".
NOXLABEL -- suppress printing any label along the X axis.
YLABEL="string" -- specify a label to be printed along the Y axis.
If you do not use this qualifier, the default label is "Actual
residuals".
NOYLABEL -- suppress printing any label along the Y axis.
ITERATION -- draw a plot for each iteration of the regression
analysis. Normally, the plot is drawn after the analysis has
converged to a solution; you may use the ITERATION option to
observe the function during each iteration of the analysis as
it converges to fit the data.
VALUES -- use in conjunction with the ITERATION option to cause
the current parameter values to be displayed before the plot
for the current iteration.
PRINT -- print a copy of the plot on an HP LaserJet printer. This
option is only available in the registered version of Nonlin.
Nonlin writes the plot to the PRN device which must be
attached to an HP Series II or Series III printer.
NOPAUSE -- do not pause after the plot is displayed. Normally,
Nonlin pauses after displaying a plot to allow you time to
examine it; you press Enter to continue execution once you
have finished looking at the plot. The NOPAUSE option causes
Nonlin to continue with execution without pausing after the
plot is displayed.
If more than one option is specified, separate them with commas.
3.21 PRESOLUTION
PRESOLUTION value; (optional) -- Specifies whether plots sent to
HP LaserJet printers should use 150 or 300 dot-per-inch
resolution. This option is only available in the registered
version of Nonlin. The value parameter must be 150 or 300. The
default value of 150 causes the plots to use most of the
horizontal width of an 8.5x11 inch page. These plots are suitable
for direct transfer to overhead transparencies. Specifying 300
for the resolution produces smaller plots that are suitable for
inclusion in printed documents.
3.22 WIDTH
WIDTH value; (optional) -- Specify the width, in inches, of
printed plots. This option is only available in the registered
version of Nonlin. Due to memory space considerations, the
maximum width is limited to about 7.9 inches for 150 DPI
resolution and 4.5 inches for 300 DPI resolution. If you have
limited memory space, you may have to reduce the width to be able
to produce printed plots. This statement is ignored unless you
request that a plot be printed.
3.23 NOECHO
NOECHO; (optional) -- Specifies that the statements and computed
results are not to be listed on the screen. The output is still
written to the listing file and any requested plots are displayed
on the screen.
3.24 Assignment Statement
The assignment statement is an executable statement that evaluates
an expression and assigns its value to a variable. The syntax for
an assignment statement is:
variable = expression; // Assign expression to variable
variable += expression; // Add expression to variable
variable -= expression; // Subtract expression from variable
variable *= expression; // Multiply variable by expression
variable /= expression; // Divide variable by expression
where "variable" is a variable that was previously declared using
a DOUBLE statement. The variable may be subscripted if it is an
array. "expression" is a valid arithmetic or logical expression
following the rules explained earlier. If the expression involves
a relational comparison operator (e.g., <, >, >=, etc.) or a
logical operation (&&, ||, !), the value 1 is used for true and 0
for false. The expression may contain any type of variable
(input, computed, or constant) along with parameters and library
functions.
3.25 IF Statement
The form of the IF statement is:
IF (expression) statement1 [ELSE statement2]
If the expression is true (not zero) statement1 is executed; if
the expression is false (0) and the ELSE clause is specified,
statement2 is executed. The ELSE clause and the second set of
controlled statements are optional. You may control groups of
statements by enclosing them in braces. The following are
examples of valid IF statements:
if (x > bigx) bigx = x;
if (x < Pivot) {
Function Y = B0+B1*(X-Pivot);
} else {
Function Y = B0+B2*(X-Pivot);
}
The PIECE.NLR command file contains an example of an IF statement.
3.26 WHILE Statement
The WHILE statement loops until the controlling expression becomes
false (0) or a BREAK statement is executed within the loop. The
form of the WHILE statement is:
WHILE (expression) {
<< controlled statements >>
}
Each time around the loop the expression is evaluated. If it is
true (non-zero) the controlled statements are executed and then
the process repeats until the expression becomes false. If a
BREAK statement is executed within the loop, execution of the loop
terminates and control is transferred to the first statement
beyond the end of the loop. If a CONTINUE statement is executed
in the loop, control is transferred to the conditional test at the
top of the loop. The following is an example of a WHILE
statement:
while (x < 5) {
x = x + xmove;
y = y + ymove;
}
3.27 DO Statement
The DO statement is very similar to the WHILE statement except the
control expression is evaluated at the end of the loop rather than
the beginning. This causes the loop always to be executed at
least once. The form of the DO statement is:
DO {
<< controlled statements >>
} WHILE (expression);
For each iteration of the loop the controlled statements are
executed and then the conditional expression is evaluated. If it
is true (non-zero) control transfers to the first controlled
statement at the top of the loop. A BREAK statement may be used
to terminate the loop before the conditional expression is
evaluated. A CONTINUE statement can be used to cause control to
be transferred from within the loop to the point where the
conditional expression is evaluated. The following is an example
of a DO statement:
do {
x += xstep;
y += ystep;
} while (x < limit);
3.28 FOR Statement
The FOR statement is a looping control statement similar to the
WHILE statement; however, the FOR statement also allows you to
specify initialization expressions that are executed once at the
beginning of the loop, and loop-end expressions that are executed
at the end of each loop cycle. The form of the FOR statement is:
FOR (expression1; expression2; expression3) statement;
Execution of a FOR statement proceeds as follows:
1. Evaluate expression1. Typically this expression will include
assignment operators ("=") to set initial values for loop
variables. If you need more than one initial expression,
specify them as a list separated by commas.
2. Evaluate expression2. If its value is false (0) terminate the
FOR statement and transfer control to the statement that
follows the controlled statement. If expression2 is true,
proceed to the next step.
3. Execute the controlled statement. If more than one statement
is to be controlled, enclose them with brace characters ("{"
"}").
4. Evaluate expression3. This expression will typically contain
operators such as "++", "+=", "--", or "-=" to modify the
value of a loop variable.
5. Transfer control to step 2, where expression2 is once again
evaluated.
The following is an example of a FOR statement:
for (time=starttime; time<endtime; time+=timestep) {
<< controlled statements >>
}
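The five steps above can be traced with an equivalent loop written in
another language. The Python sketch below is illustrative only
(for_loop_trace is a hypothetical name, not part of Nonlin); it
mirrors the order in which the three expressions are evaluated.

```python
def for_loop_trace(starttime, endtime, timestep):
    """Emulate: for (time=starttime; time<endtime; time+=timestep) {...}"""
    visited = []
    time = starttime            # step 1: evaluate expression1 once
    while time < endtime:       # step 2: test expression2
        visited.append(time)    # step 3: execute the controlled statement
        time += timestep        # step 4: evaluate expression3
    return visited              # step 5 is the jump back to the test

print(for_loop_trace(0, 10, 2.5))
```

Because the test in step 2 comes before the controlled statement, the
loop body is skipped entirely when expression2 is false on entry.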
3.29 BREAK Statement
The BREAK statement can be used in FOR, WHILE, and DO loops to
terminate the loop and cause control to transfer to the statement
beyond the end of the loop. The following is an example of a
BREAK statement:
time = 0;
x = 0;
while (time < endtime) {
x += delta * xspeed;
if (x > 10) break;
time += delta;
}
3.30 CONTINUE Statement
The CONTINUE statement can be used in FOR, WHILE, and DO loops to
terminate the current iteration and begin the next one. When
CONTINUE is executed in a WHILE or DO statement, control is
transferred to the point in the loop where the loop control
expression is evaluated. When CONTINUE is executed in a FOR
statement, control is transferred to the bottom of the loop where
expression3 is evaluated (which normally augments the values of
the loop variables for the next iteration). The form of the
CONTINUE statement is:
continue;
3.31 STOP Statement
The STOP statement terminates the calculations for the current
iteration. The last value of the dependent variable (as
specified with a FUNCTION statement) is used as the calculated
value of the function. An implicit stop occurs if you "fall
through" the last executable statement. The form of the STOP
statement is:
stop;
3.32 DATA
DATA ["file"]; (required) -- Specifies the name of the file
containing the data records, or introduces the data records which
follow the statement. If a file name is specified on the DATA
statement, the file is opened, its data records are read, and the
regression analysis is performed. If a file name is specified
without an extension, ".DAT" is used by default. Note that if you
specify a file name it must be enclosed in quote marks.
If no file name is specified on the DATA statement, the data
records must immediately follow the DATA statement in the command
file.
Each data record must contain at least as many data values as the
number of variables specified on the VARIABLES statement(s). The
order of the variables as specified on the VARIABLES statement
must match the order of the values in each observation. Any data
values beyond those required for the specified variables are
ignored. Each observation must begin on a new line.
The data values must be separated by one or more spaces and/or a
comma. You may place a comment on the end of a data record by
beginning the comment with "//". Data values may contain decimal
points and may be expressed in exponential notation (i.e.,
n.nnnnEppp).
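The record rules above can be summarized in a few lines of code. This
Python sketch is not Nonlin's parser; parse_record is a hypothetical
helper, and treating every comma as a space is a simplification of
the "spaces and/or a comma" rule.

```python
def parse_record(line, num_variables):
    """Parse one data record following the rules described above."""
    text = line.split("//", 1)[0]            # drop a trailing comment
    fields = text.replace(",", " ").split()  # spaces and/or commas separate values
    if len(fields) < num_variables:
        raise ValueError("record has too few data values")
    # Values beyond those required for the variables are ignored.
    return [float(field) for field in fields[:num_variables]]

values = parse_record("2, 10000 1.3E4 999 // extra value ignored", 3)
print(values)
```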
The DATA statement must be the last statement in the command file.
The following is an example of a complete command file
including data records:
Variables age,miles,value;
Parameters base,depage,depmiles;
Function value = base + depage*age + depmiles*miles;
Data;
2 10000 13000
4 42000 9000
1 7000 17000
6 52000 6000
5 48000 8000
If the data records had been placed in a separate file named
CAR.DAT, the statements would read as follows:
Variables age,miles,value;
Parameters base,depage,depmiles;
Function value = base + depage*age + depmiles*miles;
Data "car.dat";
Chapter 4
Understanding The Results
4.1 Descriptive Statistics for Variables
Nonlin prints a variety of statistics at the end of each analysis.
For each variable, Nonlin lists the minimum value, the maximum
value, the mean value, and the standard deviation. You should
confirm that these values are within the ranges you expect.
4.2 Parameter Estimates
For each parameter, Nonlin displays the initial parameter estimate
(which you specified on the PARAMETER statement, or 1 by default),
the final (maximum likelihood) estimate, the standard error of the
estimated parameter value, the "t" statistic comparing the
estimated parameter value with zero, and the significance of the t
statistic. Nine significant digits are displayed for the
parameter estimates. If you need to determine the parameters to
greater precision, use the POUTPUT statement.
The final estimate parameter values are the results of the
analysis. By substituting these values in the equation you
specified to be fitted to the data, you will have a function that
can be used to predict the value of the dependent variable based
on a set of values for the independent variables. For example, if
the equation being fitted is
y = p0 + p1*x
and the final estimates are 1.5 for p0 and 3 for p1, then the
equation
y = 1.5 + 3*x
is the best equation of this form that will predict the value of y
based on the value of x.
4.3 t Statistic
The "t" statistic is computed by dividing the estimated value of
the parameter by its standard error. This statistic is a measure
of the likelihood that the actual value of the parameter is not
zero. The larger the absolute value of t, the less likely that
the actual value of the parameter could be zero.
4.4 Prob(t)
The "Prob(t)" value is the probability of obtaining an estimate at
least as far from zero as the one observed if the actual parameter
value is zero. The smaller the value of Prob(t), the more
significant the parameter and the less plausible it is that the
actual parameter value is zero. For example, assume the estimated
value of a parameter is 1.0 and its standard error is 0.7. Then
the t value would be 1.43 (1.0/0.7). If the computed Prob(t) value
were 0.05, this would indicate that, if the actual value of the
parameter were zero, there would be only a 0.05 (5%) chance of
obtaining a t value this large in magnitude. If Prob(t) were 0.001
there would be only 1 chance in 1000. If Prob(t) were 0.92 the
estimate would be entirely consistent with an actual value of zero;
this implies that the term of the regression equation containing
the parameter can be eliminated without significantly affecting the
accuracy of the regression.
One thing that can cause Prob(t) to be 1.00 (or near 1.00) is
having redundant parameters. If at the end of an analysis several
parameters have Prob(t) values of 1.00, check the function
carefully to see if one or more of the parameters can be removed.
Also try using a DOUBLE statement to set one or more of the
parameters to a reasonable fixed value; if the other parameters
suddenly become significant (i.e., Prob(t) much less than 1.00)
then the parameters are mutually dependent and one or more should
be removed. See Section 6.2 for more information about mutually
dependent parameters.
The t statistic probability is computed using a two-sided test.
The CONFIDENCE statement can be used to cause Nonlin to print
confidence intervals for parameter values. The SQUARE.NLR example
regression includes an extraneous parameter (p0) whose estimated
value is much smaller than its standard error; the Prob(t) value
is 0.99982 indicating that there is a high probability that the
value is zero.
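To make the two-sided test concrete, the following Python sketch
computes t and Prob(t) from an estimate and its standard error. It is
an illustration rather than Nonlin's own code: the function names are
hypothetical, the degrees of freedom (assumed here to be the number
of observations minus the number of parameters) must be supplied by
the caller, and the tail area is obtained by simple numerical
integration of the Student t density.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_prob_t(estimate, std_error, df, intervals=2000):
    """Return (t, Prob(t)); integrates the density with Simpson's rule."""
    t = estimate / std_error
    upper = abs(t)
    h = upper / intervals                 # intervals must be even
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3      # area between -|t| and +|t|
    return t, 1.0 - central_area

# Estimate 1.0 with standard error 0.7; df=10 is an assumed value.
t, prob = two_sided_prob_t(1.0, 0.7, df=10)
print(f"t = {t:.2f}, Prob(t) = {prob:.3f}")
```

The sketch relies on math.gamma, which overflows for very large
degrees of freedom; a production routine would use a statistics
library instead.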
4.5 Final Sum of Squared Deviations
In addition to the variable and parameter values, Nonlin displays
several statistics that indicate how well the equation fits the
data. The "Final sum of squared deviations" is the sum of the
squared differences between the actual value of the dependent
variable for each observation and the value predicted by the
function, using the final parameter estimates.
4.6 Average and Maximum Deviation
The "Average deviation" is the average over all observations of
the absolute value of the difference between the actual value of
the dependent variable and its predicted value.
The "Maximum deviation for any observation" is the maximum
difference (ignoring sign) between the actual and predicted value
of the dependent variable for any observation.
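The three deviation statistics described in Sections 4.5 and 4.6 can
be reproduced from the actual and predicted values. The Python sketch
below uses made-up numbers and a hypothetical function name purely
for illustration.

```python
def deviation_summary(actual, predicted):
    """Return (sum of squared deviations, average deviation, maximum deviation)."""
    deviations = [a - p for a, p in zip(actual, predicted)]
    sum_squared = sum(d * d for d in deviations)
    average = sum(abs(d) for d in deviations) / len(deviations)
    maximum = max(abs(d) for d in deviations)
    return sum_squared, average, maximum

# Made-up observations and predictions; deviations are -1, 0, and 2.
summary = deviation_summary([10, 12, 15], [11, 12, 13])
print(summary)
```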
4.7 Proportion of Variance Explained
The "Proportion of variance explained (R^2)" indicates how much
better the function predicts the dependent variable than just
using the mean value of the dependent variable. This is also
known as the "coefficient of multiple determination." It is
computed as follows: Suppose that we did not fit an equation to
the data and ignored all information about the independent
variables in each observation. Then, the best prediction for the
dependent variable value for any observation would be the mean
value of the dependent variable over all observations. The
"variance" is the sum of the squared differences between the mean
value and the value of the dependent variable for each
observation. Now, if we use our fitted function to predict the
value of the dependent variable, rather than using the mean value,
a second kind of variance can be computed by taking the sum of the
squared differences between the value of the dependent variable
predicted by the function and the actual value. Hopefully, the
variance computed by using the values predicted by the function is
better (i.e., a smaller value) than the variance computed using
the mean value. The "Proportion of variance explained" is
computed as 1 - (variance using predicted value / variance using
mean). If the function perfectly predicts the observed data, the
value of this statistic will be 1.00 (100%). If the function does
no better a job of predicting the dependent variable than using
the mean, the value will be 0.00.
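The computation just described can be written out directly. This
Python sketch follows the wording above; the function name and the
sample values are invented for illustration.

```python
def proportion_of_variance_explained(actual, predicted):
    """R^2 = 1 - (variance using predicted value / variance using mean)."""
    mean = sum(actual) / len(actual)
    variance_using_mean = sum((a - mean) ** 2 for a in actual)
    variance_using_predicted = sum((a - p) ** 2
                                   for a, p in zip(actual, predicted))
    return 1 - variance_using_predicted / variance_using_mean

actual = [1, 2, 3, 4]
r2 = proportion_of_variance_explained(actual, [1.1, 1.9, 3.2, 3.8])
print(f"R^2 = {r2:.2f}")
```

Passing the actual values as the predictions yields 1.00, and passing
the mean for every observation yields 0.00, matching the two limiting
cases in the text.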
4.8 Adjusted Coefficient of Multiple Determination
The "adjusted coefficient of multiple determination (Ra^2)" is an
R^2 statistic adjusted for the number of parameters in the
equation and the number of data observations. It is a more
conservative estimate of the percent of variance explained,
especially when the sample size is small compared to the number of
parameters. It is computed using the formula:
Ra^2 = 1 - (n-1)/(n-p) * (1-R^2)
where 'n' is the number of observations, 'p' is the number of
parameters, and 'R^2' is the unadjusted coefficient of multiple
determination.
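As a quick numerical check of the formula, the Python sketch below
evaluates Ra^2 for an invented case with 10 observations, 3
parameters, and an R^2 of 0.90 (the function name is hypothetical).

```python
def adjusted_r_squared(r_squared, n, p):
    """Ra^2 = 1 - (n-1)/(n-p) * (1-R^2)."""
    return 1 - (n - 1) / (n - p) * (1 - r_squared)

ra2 = adjusted_r_squared(0.90, n=10, p=3)
print(f"Ra^2 = {ra2:.4f}")
```

The adjusted value is lower than the unadjusted 0.90, and the gap
widens as the number of parameters approaches the number of
observations.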
4.9 Durbin-Watson Statistic
The "Durbin-Watson test for autocorrelation" is a statistic that
indicates the likelihood that the deviation (error) values for the
regression have a first-order autoregression component. The
regression models assume that the error deviations are
uncorrelated.
In business and economics, many regression applications involve
time series data. If a non-periodic function, such as a straight
line, is fitted to periodic data the deviations have a periodic
form and are positively correlated over time; these deviations are
said to be "autocorrelated" or "serially correlated."
Autocorrelated deviations may also indicate that the form (shape)
of the function being fitted is inappropriate for the data values
(e.g., a linear equation fitted to quadratic data).
If the deviations are autocorrelated, there may be a number of
consequences for the computed results: 1) The estimated regression
coefficients no longer have the minimum variance property; 2) the
mean square error (MSE) may seriously underestimate the variance
of the error terms; 3) the computed standard error of the
estimated parameter values may underestimate the true standard
error, in which case the t values and confidence intervals may be
incorrect. Note that if an appropriate periodic function is
fitted to periodic data, the deviations from the regression will
be uncorrelated because the cycle of the data values is accounted
for by the fitted function.
Small values of the Durbin-Watson statistic indicate the presence
of autocorrelation. Consult significance tables in a good
statistics book for exact interpretations; however, a value less
than 0.80 usually indicates that autocorrelation is likely. If
the Durbin-Watson statistic indicates that the residual values are
autocorrelated, it is recommended that you use the RPLOT and/or
NPLOT statements to display a plot of the residual values.
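The manual does not give the formula, but the Durbin-Watson statistic
is conventionally defined as the sum of squared differences between
successive residuals divided by the sum of squared residuals. The
Python sketch below assumes that standard definition and uses made-up
residuals; values near 2 correspond to uncorrelated residuals.

```python
def durbin_watson(residuals):
    """Standard Durbin-Watson statistic; residuals in observation order."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Slowly drifting residuals (positively autocorrelated).
drifting = durbin_watson([1, 2, 3, 2, 1, -1, -2, -3, -2, -1])
# Residuals that flip sign at every observation.
alternating = durbin_watson([1, -1, 1, -1])
print(f"{drifting:.3f} {alternating:.3f}")
```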
If the data has a regular, periodic component you can try
including a sin term in your function. The TREND.NLR example fits
a function with a sin term to data that has a linear growth with a
superimposed sin component. With the sin term the function has a
residual value of 29.39 and a Durbin-Watson value of 2.001;
without the sin term (i.e., fitting only a linear function) the
residual value is 119.16 and the Durbin-Watson value is 0.624
indicating strong autocorrelation. The general form of a sin term
is
amplitude*sin(2*pi*(x-phase)/period)
where 'amplitude' is a parameter that determines the magnitude of
the sin component, 'period' determines the period of the
oscillation, and 'phase' determines the phase relative to the
starting value. If you know the period (e.g., 12 for monthly data
with an annual cycle) you should specify it rather than having
Nonlin attempt to determine it.
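The shape of such a function is easy to explore numerically. The
Python sketch below adds the sin term to a linear trend; the function
and parameter names are hypothetical and are not taken from
TREND.NLR.

```python
import math

def trend_with_cycle(x, intercept, slope, amplitude, phase, period=12.0):
    """Linear growth plus amplitude*sin(2*pi*(x-phase)/period)."""
    cycle = amplitude * math.sin(2 * math.pi * (x - phase) / period)
    return intercept + slope * x + cycle

# With period 12 and phase 0 the cycle peaks a quarter period in (x = 3).
peak = trend_with_cycle(3, intercept=0, slope=0, amplitude=5, phase=0)
print(f"{peak:.3f}")
```

Values one full period apart differ only by the linear growth, which
is why fixing a known period leaves fewer parameters to estimate.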
If an NPLOT statement is used to produce a normal probability plot
of the residuals, the correlation between the residuals and their
expected values (assuming they are normally distributed) is
printed in the listing. If the residuals are normally
distributed, the correlation should be close to 1.00. A
correlation less than 0.94 suggests that the residuals are not
normally distributed.
4.10 Analysis of Variance Table
An "Analysis of Variance" table provides statistics about the
overall significance of the model being fitted.
4.11 Correlation Matrix
The CORRELATE statement can be used to cause Nonlin to print a
correlation matrix. A "correlation coefficient" is a value that
indicates whether there is a linear relationship between two
variables. The absolute value of the correlation coefficient will
be in the range 0 to 1. A value of 0 indicates that there is no
relationship whereas a value of 1 indicates that there is a
perfect correlation and the two variables vary together. The sign
of the correlation coefficient will be negative if there is an
inverse relationship between the variables (i.e., as one increases
the other decreases).
For example, consider a study measuring the height and weight of a
group of individuals. The correlation coefficient between height
and weight will likely have a positive value somewhat less than
one because tall people tend to weigh more than short people. A
study comparing number of cigarettes smoked with age at death will
probably have a negative correlation value.
A correlation matrix shows the correlation between each pair of
variables. The diagonal of the matrix has values of 1.00 because
a variable always has a perfect correlation with itself. The
matrix is symmetric about the diagonal because X correlated with Y
is the same as Y correlated with X.
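The properties of the matrix can be seen in a small example. This
Python sketch computes the ordinary (Pearson) correlation coefficient
for three made-up variables; it is an illustration and not the output
of the CORRELATE statement.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Correlation between each pair of variables."""
    return [[correlation(a, b) for b in columns] for a in columns]

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]      # tends to rise with x
z = [5, 4, 3, 2, 1]      # falls exactly as x rises
matrix = correlation_matrix([x, y, z])
for row in matrix:
    print(" ".join(f"{value:6.3f}" for value in row))
```

The diagonal entries are 1, the matrix is symmetric, and the entry
for x and z is -1 because the relationship is exactly inverse.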
Problems occur in regression analysis when a function is specified
that has multiple independent variables that are highly
correlated. The common interpretation of the computed regression
parameters as measuring the change in the expected value of the
dependent variable when the corresponding independent variable is
varied while all other independent variables are held constant is
not fully applicable when a high degree of correlation exists.
This is due to the fact that with highly correlated independent
variables it is difficult to attribute changes in the dependent
variable to one of the independent variables rather than another.
The following are effects of fitting a function with highly
correlated independent variables:
1. Large changes in the estimated regression parameters may occur
when a variable is added or deleted, or when an observation is
added or deleted.
2. Individual tests on the regression parameters may show the
parameters to be nonsignificant.
3. Regression parameters may have the opposite algebraic sign from
that expected from theoretical or practical considerations.
4. The confidence intervals for important regression parameters
may be much wider than would otherwise be the case.
The solution to these problems may be to select the most
significant of the correlated variables and use only it in the
function.
Note: the correlation coefficients indicate the degree of linear
association between variables. Variables may be highly related in
a nonlinear fashion and still have a correlation coefficient near
0.
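A simple illustration of this note: if y is exactly the square of x
and the x values are symmetric about zero, the correlation
coefficient is zero even though y is completely determined by x. The
Python sketch below uses made-up values and a hypothetical function
name.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [-2, -1, 0, 1, 2]
y = [value * value for value in x]   # exact nonlinear relationship
r = correlation(x, y)
print(f"{r:.3f}")
```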
Chapter 5
Theory of Operation
5.1 Minimization Algorithm
Nonlin uses a model/trust-region technique along with an adaptive
choice of the model Hessian. The algorithm is essentially a
combination of Gauss-Newton and Levenberg-Marquardt methods;
however, the adaptive algorithm often works much better than
either of these methods alone.
The basis for the minimization technique used by Nonlin is to
compute the sum of the squared residuals for one set of parameter
values and then slightly alter each parameter value and recompute
the sum of squared residuals to see how the parameter value change
affects the sum of the squared residuals. By dividing the
difference between the original and new sum of squared residual
values by the amount the parameter was altered, Nonlin is able to
determine the approximate partial derivative with respect to the
parameter. This partial derivative is used by Nonlin to decide
how to alter the value of the parameter for the next iteration.
If the function being modeled is well behaved, and the starting
value for the parameter is not too far from the optimum value, the
procedure will eventually converge to the best estimate for the
parameter. This procedure is carried out simultaneously for all
parameters and is, in fact, a minimization problem in
n-dimensional space, where 'n' is the number of parameters.
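The finite-difference idea described above can be sketched in
Python. This is only an illustration, not Nonlin's code: the
function names, the step-size rule, and the sample data are
assumptions, and the published algorithm is considerably more
elaborate.

```python
def sum_sq_residuals(func, params, data):
    """Sum of squared residuals of func(x, params) over (x, y) observations."""
    return sum((y - func(x, params)) ** 2 for x, y in data)

def approx_gradient(func, params, data, rel_step=1e-6):
    """Forward-difference estimate of the partial derivative of the
    sum of squared residuals with respect to each parameter."""
    base = sum_sq_residuals(func, params, data)
    grad = []
    for i, p in enumerate(params):
        h = rel_step * max(abs(p), 1.0)    # small change (assumed step rule)
        shifted = list(params)
        shifted[i] = p + h                 # alter one parameter only
        changed = sum_sq_residuals(func, shifted, data)
        grad.append((changed - base) / h)  # difference / amount altered
    return grad

def line(x, p):
    """Hypothetical model y = p0 + p1*x."""
    return p[0] + p[1] * x

# Hypothetical noise-free data generated from y = 2 + 3*x
data = [(float(x), 2.0 + 3.0 * x) for x in range(5)]

grad_at_start = approx_gradient(line, [0.0, 0.0], data)    # far from optimum
grad_at_optimum = approx_gradient(line, [2.0, 3.0], data)  # nearly zero
```

Large derivatives at the starting point indicate that the
parameters should move; derivatives near zero indicate that a
minimum has been reached.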
For a much more detailed explanation of the regression algorithm
used by Nonlin see Dennis, J.E., Gay, D.M., and Welsch, R.E., "An
adaptive nonlinear least-squares algorithm," ACM Transactions on
Mathematical Software 7,3 (Sept. 1981).
5.2 Convergence Criterion
Nonlin has several convergence criteria that stop the iterative
minimization procedure. The TOLERANCE statement can be used to
alter the convergence tolerance value.
Two internal variables are used to determine when convergence has
occurred. RFCTOL has a default value of 1E-10 and can be altered
by use of the TOLERANCE statement. AFCTOL has a default value of
1E-20 and is only altered by the TOLERANCE statement if the value
specified is less than the default value. In the discussion which
follows the "function value" is half the sum of the squared
residuals computed using the current parameter estimates.
"Relative function convergence" is reported if the predicted
maximum possible function reduction is at most RFCTOL*ABS(F0)
where F0 is the function value at the start of the current
iteration, and if the last step attempted achieved no more than
twice the predicted function decrease.
"Absolute function convergence" is reported if the function value
is less than AFCTOL.
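The two tests described above can be written out as a short Python
sketch. The tolerance values come from the text; the function and
argument names, and the order in which the tests are applied, are
assumptions made for illustration.

```python
RFCTOL = 1e-10   # default relative tolerance quoted above
AFCTOL = 1e-20   # default absolute tolerance quoted above

def function_value(residuals):
    """Half the sum of squared residuals, as defined in the text."""
    return 0.5 * sum(r * r for r in residuals)

def convergence_message(f0, f_now, max_predicted_reduction,
                        step_predicted_reduction, step_actual_reduction):
    """Return the message that applies, or None to keep iterating.

    f0 is the function value at the start of the iteration and
    f_now is the current function value (hypothetical names).
    """
    if f_now < AFCTOL:
        return "Absolute function convergence"
    if (max_predicted_reduction <= RFCTOL * abs(f0)
            and step_actual_reduction <= 2.0 * step_predicted_reduction):
        return "Relative function convergence"
    return None
```

For example, with F0 = 4 a predicted maximum reduction of 1E-12 is
below RFCTOL*ABS(F0) = 4E-10, so relative function convergence
would be reported provided the last step did not do better than
twice its predicted decrease.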
Chapter 6
Hints for Nonlin Use
6.1 Convergence Failures
One of the potential problems that confronts any nonlinear
minimization procedure is non-convergence. Non-convergence is
usually not a problem for regressions using a linear model, but
becomes a more serious consideration when using complicated
nonlinear functions; increasing the number of parameters
aggravates the problem.
Non-convergence can occur in two ways: the solution may diverge or
it may converge to the wrong solution -- a local minimum rather
than the global minimum. Periodic functions, such as sin and
cos, are particularly prone to convergence problems. For example,
consider a nonlinear regression performed with the function:
y = offset + amplitude * sin(frequency * x)
where x and y are variables, and offset, amplitude, and frequency
are the parameters whose values are to be determined. If the
starting value for frequency is not reasonably close to the
correct value, the solution may converge to a harmonic (multiple)
or subharmonic (fraction) of the frequency. A command
file named SINE.NLR is supplied with the statements and data to
perform this analysis.
The SWEEP statement can be very useful in cases like the sine
example. In the SINE.NLR example analysis, the actual value of
the frequency is 3; the function converges to the correct solution
if the starting value is in the range 2.6 to 3.3. However, this
example is quite insensitive to the starting value of the
amplitude parameter. With an actual value of 2, the correct
solution is found with starting values from 1 through 10000.
Similarly, the offset parameter, which had an actual value of 10,
was successfully determined with starting values ranging from 1 to
over 50000.
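The sensitivity to the frequency can be explored outside Nonlin
with a Python sketch. It does not reproduce Nonlin's iterations or
the SINE.NLR data: the data here are hypothetical and noise-free,
and the scan over trial frequencies is only loosely analogous to
trying many starting values with SWEEP. For each trial frequency
the offset and amplitude are fitted by ordinary least squares,
which is possible because the function is linear in those two
parameters.

```python
import math

def best_linear_fit_ssr(freq, data):
    """For a fixed frequency, fit offset and amplitude by least squares
    and return the resulting sum of squared residuals."""
    s = [math.sin(freq * x) for x, _ in data]
    y = [yy for _, yy in data]
    n = len(data)
    mean_s = sum(s) / n
    mean_y = sum(y) / n
    sss = sum((si - mean_s) ** 2 for si in s)
    ssy = sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
    amplitude = ssy / sss
    offset = mean_y - amplitude * mean_s
    return sum((yi - offset - amplitude * si) ** 2 for si, yi in zip(s, y))

# Hypothetical data: offset = 10, amplitude = 2, frequency = 3
xs = [0.1 * i for i in range(100)]
data = [(x, 10.0 + 2.0 * math.sin(3.0 * x)) for x in xs]

trial_freqs = [k / 10 for k in range(5, 61)]    # 0.5, 0.6, ..., 6.0
ssr_by_freq = {f: best_linear_fit_ssr(f, data) for f in trial_freqs}
best_freq = min(ssr_by_freq, key=ssr_by_freq.get)
```

Only the trial frequency equal to the true value drives the sum of
squared residuals to (essentially) zero; offset and amplitude cannot
compensate for a wrong frequency.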
Another example which is sensitive to a parameter starting value
is POWER.NLR which attempts to determine the values of the
parameters p0, p1, and p2 for the function
y = p0 + p1*x^p2
(where "x^p2" means x raised to the p2 power). The actual value
of p2 in the example data is 2; the solution converges correctly
if the starting value of p2 is in the range 1.8 to 3.8. As with
the other example, the solution is relatively insensitive to the
starting values of p0 and p1.
6.2 Singular Matrix Problems
Another possible problem is that the analysis may stop with the
message "Singular convergence. Mutually dependent parameters?".
This is usually due to one of two things: (1) a redundant
parameter that is co-dependent with another parameter, or (2) a
situation where the value of one parameter "blocks" the effect of
other parameters. As an example of a redundant parameter,
consider the function
y = p0 + p1*p2*x
This is a simple linear equation except there are two parameters,
p1 and p2, both of which multiply the variable x. It should
be clear that there is no unique solution to this problem since
any value of p1 is possible if the right value of p2 is chosen.
Similarly, the function
y = p0 + p1 + p2*x
has no unique solution since either p0 or p1 is redundant.
Similarly, in the equation
y = p0 + p1*exp(x+p2)
either p1 or p2 is redundant.
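A short Python sketch makes the redundancy concrete. The parameter
values and function names are hypothetical; the point is that
different parameter sets give identical computed values, so no
data set can distinguish between them.

```python
import math

def model_product(x, p0, p1, p2):
    """y = p0 + p1*p2*x : only the product p1*p2 affects the result."""
    return p0 + p1 * p2 * x

def model_exp(x, p0, p1, p2):
    """y = p0 + p1*exp(x+p2), the same as p0 + (p1*exp(p2))*exp(x)."""
    return p0 + p1 * math.exp(x + p2)

xs = [0.0, 1.0, 2.0, 3.0]

fit_a = [model_product(x, 1.0, 2.0, 3.0) for x in xs]
fit_b = [model_product(x, 1.0, 6.0, 1.0) for x in xs]   # p1*p2 is 6 again

exp_a = [model_exp(x, 1.0, 2.0, 0.0) for x in xs]
exp_b = [model_exp(x, 1.0, 1.0, math.log(2.0)) for x in xs]  # p1*exp(p2) = 2
```

In both cases the remedy is to remove one parameter, for example
by writing y = p0 + p1*x or y = p0 + p1*exp(x).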
The second type of singular matrix problem can be illustrated by
the function
y = p0 + p1*x^p2
If, during the solution process, p1 takes on the value 0, then
varying the value of p2 has no effect on the equation and Nonlin
cannot figure out which way to change the value of p2 to move
toward convergence. The solution to this problem is to assign a
starting value that is not zero to p1, and use the CONSTRAIN
statement to force p1 to remain non-zero.
6.3 Performance Issues
Nonlin is carefully programmed and compiled with an optimizing
compiler for maximum performance. However, Nonlin is a real
"number cruncher," and the nonlinear regression algorithm is
mathematically very elaborate. During each iteration, Nonlin
computes gradients, Jacobians, Hessians, and eigenvalues, and
performs QR and Cholesky matrix decompositions. All calculations
are carried out using double precision (64 bit) floating point.
Nonlin does not require an 80x87 numeric coprocessor, but its
performance is greatly enhanced if one is present. In fact, an
8088 CPU with an 8087 numeric coprocessor can perform regression
analyses faster than a 20 MHz 80386 that does not have a
coprocessor. If you have an 8088 without a coprocessor, be
patient -- Nonlin is probably giving it the workout of its life.
Very long running times can result if you use the SWEEP statement
with many starting values. The problem is compounded if you have
multiple SWEEP statements. If you use the SWEEP statement to try
a large number of starting parameter values, you can save time by
using the ITERATIONS statement to specify a small number of
iterations (such as 5) during the initial attempt to find a
solution. Once a feasible set of starting parameter values has
been determined, remove the SWEEP statement, specify the starting
values on the PARAMETERS statement, increase the number of
iterations, and rerun the analysis to get the final result.
6.4 Program Limits
The following is a summary of the Nonlin program limitations:
Maximum number of variables = 25
Maximum number of parameters = 25
Maximum length of variable or parameter names = 10
The maximum number of data observations that Nonlin can handle
depends on the number of parameters as shown by the table that
follows:
# Parameters    Max Observations
      1               2019
      2               1611
      3               1339
      4               1144
      5                997
      6                883
      7                791
      8                715
      9                652
     10                599
Chapter 7
Example Analyses
A number of example regression analysis files are provided with
your Nonlin distribution. All of the example command files have
the extension ".NLR". Some of the important ones are described
below; others contain comment lines that explain what they do.
LINEAR.NLR -- Simple linear regression with plotted function and
data.
QUAD.NLR -- Fit a quadratic equation. Plot the function and the
data.
ASYMPTOT.NLR -- Fit an asymptotic function Y = 12 - 10/X.
AIDS.NLR -- A logistic curve is a growth curve used to model
functions which increase gradually at first, more rapidly in
the middle growth period, and slowly at the end, leveling off
at a maximum value after some period of time. This type of
curve is frequently used to model biological growth patterns
where there is an initial exponential growth period followed
by a leveling off as more of the population is infected or as
the food supply or some other factor limits further growth.
The form of the symmetric logistic growth function is:
y = k / (1 + exp(a + b*x))
where 'k', 'a', and 'b' are parameters that shape and scale
the function. The value of 'b' is negative.
The AIDS.NLR example fits a logistic curve to the number of
new cases of AIDS reported in the United States during the
period 1981 through 1992. The computed function fits the data
remarkably well showing that the AIDS infection rate is
following a classic logistic curve and should level off at
about 47,500 new cases per year (in the United States). The
DOMAIN option on the PLOT statement causes Nonlin to
extrapolate the plot of the function through 1995.
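The shape of the logistic function can be seen from a small Python
sketch. The plateau value k is the leveling-off figure quoted
above; the values of a and b, and the meaning of x, are
hypothetical and are not the estimates produced by AIDS.NLR.

```python
import math

def logistic(x, k, a, b):
    """Symmetric logistic growth curve y = k / (1 + exp(a + b*x))."""
    return k / (1.0 + math.exp(a + b * x))

k = 47500.0          # plateau (leveling-off value quoted above)
a, b = 6.0, -0.75    # hypothetical shape parameters; b must be negative

midpoint = -a / b                    # x at which the curve reaches k/2
early = logistic(0.0, k, a, b)       # slow initial growth
half = logistic(midpoint, k, a, b)   # steepest growth, y = k/2
late = logistic(40.0, k, a, b)       # close to the plateau k
```

The curve is symmetric about the point x = -a/b, where it reaches
half of its maximum value.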
F33.NLR -- Multivariate linear regression (multiple regression).
Calculate the value of a used Beech F33 Bonanza airplane using
a linear model based on its age, the number of hours on its
airframe, and the number of hours on its engine. The t value
and Prob(t) indicate that the number of hours on the engine
('Engdep' parameter) is not significant to the regression
model; the other parameters are significant but airframe hours
is less significant than the base price and age of the plane.
F33YEAR.NLR -- Similar to F33.NLR except the price of the Bonanza
is calculated based on a linear function of only the age.
F33EXP.NLR -- Similar to F33YEAR.NLR except a negative exponential
function is used rather than a linear function. Compare the
fit of this model with that of the F33YEAR.NLR example.
SINE.NLR -- Fit an equation involving a sin function. The SWEEP
statement is used to find a starting point that will converge.
TREND.NLR -- Fit a function that has a linear growth term and a
periodic component involving a sin term. See Section 4.9 for
additional information about this example.
SQUARE.NLR -- Fit a sine series to a square wave. Note in this
example that the 'p0' parameter, which represents the constant
term of the equation, has an estimated value of 9.22715E-006
(very nearly zero) and a standard error of 0.0398754. This
yields a t value of nearly zero and Prob(t) of 0.99982 which
means that there is a 99.982% chance that the actual value of
p0 may be zero (it is in fact zero). This illustrates how you
can use the t value and Prob(t) to identify extraneous
parameters.
COOLING.NLR -- Fit an equation involving an exponential function.
If a heated object is allowed to cool, the rate of cooling at
any instant is proportional to the difference between the
object's temperature and the ambient (room) temperature. In
other words, an object cools faster at first, while it is hot,
and the rate of cooling slows down as the temperature of the
object approaches the ambient temperature. The function that
relates the object's temperature to time is:
Temperature = Roomtemp+InitTemp*exp(-Coolrate*Time)
Where InitTemp is the number of degrees above room temperature
at time 0, and Coolrate is a factor that depends on the mass
of the object, how well it is insulated, etc. The exp
function is the value of e (2.7182818...) raised to a power.
The COOLING.NLR example determines the parameters InitTemp and
Coolrate to fit an equation of this form to some data the
author collected.
BOIL.NLR -- The boiling point of water decreases as the pressure
in the vessel containing the water decreases. "Clapeyron's
equation" shows that the boiling point is related to pressure
according to the following function:
Temperature = b / log(Pressure/a) - 459.7
Where 'Temperature' is in degrees Fahrenheit (subtracting the 459.7
constant converts degrees Rankine -- relative to absolute zero --
to degrees Fahrenheit), 'Pressure' is the pressure in the
vessel in pounds per square inch, and 'a' and 'b' are
parameters whose values are to be determined. The data for
this example was collected by the author's son for a science
project.
MAGNET.NLR -- Fit a function involving an arc tangent and a
variable to the third power. This is an interesting physics
problem. If a magnet is placed due east of a compass, the
deflection of the compass needle from north is equal to the
arc tangent of the ratio of the strength of the magnet's field
relative to the earth's magnetic field. The strength of the
magnet's field at the compass is inversely proportional to the
cube of the distance from the magnet to the compass. Thus,
the function relating these terms is
Deflection = deg(atan(Strength / Distance ^ 3))
The deg function converts an angle in radians to degrees. In
the example, Deflection and Distance are the variables, and
the value of the Strength parameter is determined.
DIODE.NLR -- The current through a diode increases sharply as the
voltage across the diode is increased. An equation that
approximates the current flow as a function of the voltage is:
I = exp(b*(V-c))
where 'I' is the current, 'V' is the voltage, and 'b' and 'c'
are parameters that are to be estimated by the nonlinear
regression.
AVLTIME.NLR -- An AVL tree is a balanced binary tree used to store
information in a computer's memory. Because the entries in an
AVL tree are kept in sorted order, and the tree is kept in a
balanced form, it is possible to rapidly find any entry in the
tree. The time required to create an AVL tree with N entries
is approximately equal to:
Time = a + b*N*log2(N)
where 'a' is a constant term equal to the overhead involved in
starting and completing a tree creation, and 'b' is a growth
coefficient that depends on the speed of the computer. The
log2(N) function is the log base 2 of N (the number of
entries). The AVLTIME.NLR example fits an equation to a data
set that relates the time in seconds required to create an AVL
tree with the number of entries in the tree.
PIECE.NLR -- Piecewise linear function. Fit a function consisting
of two linear pieces that bend at X=5. When X is less than 5,
the slope of the function is B1. When X is greater than or
equal to 5, the slope is B2. B0 is the Y value of the
function at X=5 (i.e., at the pivot point). The IF statement
is used to control which function model is used depending on
whether the value of the independent variable is greater than or
less than the pivot point.
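The piecewise function described for PIECE.NLR can be sketched in
Python as follows. This is one parameterization consistent with
the description (B0 is the value at the pivot, B1 and B2 are the
two slopes); it is not a copy of the statements in PIECE.NLR, and
the parameter values are hypothetical.

```python
def piecewise_linear(x, b0, b1, b2, pivot=5.0):
    """Two straight segments that meet at the point (pivot, b0)."""
    if x < pivot:
        return b0 + b1 * (x - pivot)   # slope b1 to the left of the pivot
    return b0 + b2 * (x - pivot)       # slope b2 at and right of the pivot

# Hypothetical parameter values
b0, b1, b2 = 10.0, 2.0, -1.0
values = [piecewise_linear(x, b0, b1, b2) for x in (3.0, 5.0, 7.0)]
```

Writing both pieces in terms of (x - pivot) guarantees that the two
segments join at the pivot point whatever the slopes are.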
Chapter 8
Special Applications
8.1 Omitted Dependent Variable
There is a class of nonlinear regression problems that can be best
expressed by omitting the dependent variable (i.e., the variable
on the left of the equal sign). To understand what this means
first consider the normal regression case with a dependent
variable. For each observation the function is evaluated and the
computed value is subtracted from the corresponding value of the
dependent variable for that observation. This residual value is
then squared and added to the other squared residual values. The
goal is to minimize the total sum of squared residuals. In the
case where the dependent variable is omitted, the function is
computed for each observation and the value of the function is
squared (i.e., it is treated as the residual) and added to the
other squared values. The goal is to minimize the sum of the
squared values of the function. Thus, for a perfect fit the
computed value of the function for every observation would be
zero.
To perform this type of analysis omit the dependent variable and
equal sign from the left side of the function specification.
As an example of this type of analysis consider the problem of
fitting a circle to a set of points that form a roughly circular
pattern (i.e., a "circular regression"). Our goal is to determine
the center point of the circle (Xc,Yc) and the radius (R) which
will make the circle best fit the points so that the sum of the
squared distances between the points and the perimeter of the
circle is minimized (the points are as close to the perimeter of
the circle as possible).
For this problem we have three parameters whose values are to be
determined: Xc, Yc, and R. There will be one data observation for
each point to which the circle is being fitted. For each point
there are two variables, Xp and Yp, the X and Y coordinates of the
point's position.
Since our goal is to minimize the sum of the squared distances
from the points to the perimeter of the circle, we need a function
that will compute this distance for each point. If the center of
the circle is at (Xc,Yc) and the position of a point is (Xp,Yp)
then, from the theorem of Pythagoras, we know the distance from
the center to the point is
sqrt((Xp-Xc)^2 + (Yp-Yc)^2)
But we are interested in the distance from the perimeter to the
point. Since the radius of the circle is R, the distance from the
perimeter to the point (along a straight line from the center to
the point) is
sqrt((Xp-Xc)^2 + (Yp-Yc)^2) - R
That is, the distance from the perimeter to the point is equal to
the distance from the center to the point less the distance from
the center to the perimeter (the radius). The distance will be
positive or negative depending on whether the point is outside or
inside the circle but this does not matter since the value is
squared as part of the minimization process.
The Nonlin statements for this analysis are as follows:
Variables Xp,Yp;
Parameters Xc,Yc,R;
Function sqrt((Xp-Xc)^2 + (Yp-Yc)^2) - R;
Note that there is no dependent variable or equal sign to the left
of the function. Nonlin will determine the values of the
parameters Xc, Yc, and R such that the sum of the squared values
of the function (i.e., the sum of the squared distances) is
minimized. The CIRCLE.NLR file contains a full example of this
analysis.
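The quantity that is minimized in the circle example can be
computed directly with a Python sketch. The points and the helper
names are hypothetical and unrelated to the data in CIRCLE.NLR; the
sketch only evaluates the function and does not perform the fit.

```python
import math

def circle_residual(xp, yp, xc, yc, r):
    """Signed distance from the point (xp, yp) to the circle's perimeter."""
    return math.sqrt((xp - xc) ** 2 + (yp - yc) ** 2) - r

def sum_of_squares(points, xc, yc, r):
    """Quantity minimized when the dependent variable is omitted."""
    return sum(circle_residual(xp, yp, xc, yc, r) ** 2 for xp, yp in points)

# Hypothetical points lying exactly on a circle: center (3, -2), radius 5
angles = (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
points = [(3.0 + 5.0 * math.cos(t), -2.0 + 5.0 * math.sin(t)) for t in angles]

ss_true = sum_of_squares(points, 3.0, -2.0, 5.0)    # essentially zero
ss_wrong = sum_of_squares(points, 2.0, -2.0, 5.0)   # center shifted by 1

inside = circle_residual(3.0, -2.0, 3.0, -2.0, 5.0)    # negative: inside
outside = circle_residual(13.0, -2.0, 3.0, -2.0, 5.0)  # positive: outside
```

The correct center and radius give a sum of squares of essentially
zero, while a shifted center gives a clearly larger value.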
As a second example similar to the first one, consider a town that
is trying to decide where to place a fire station. The location
should be central such that the sum of the squared distances from
the station to each house is minimized. Nonlin can be used to
determine the coordinates of the station (Xc,Yc) given a set of
coordinates for each house location (Xh,Yh) by using a slightly
simpler function than the first example:
Function sqrt((Xh-Xc)^2 + (Yh-Yc)^2);
8.2 Root Finding and Expression Minimization
Although it is designed for nonlinear regression analysis, Nonlin
can also be used to find the root (zero point) or minimum absolute
value of a nonlinear expression. To use Nonlin in this fashion
follow these steps:
. Do not use any VARIABLE statements.
. Use PARAMETER statements to specify the names and optional
starting values for the parameters whose values are to be
determined as the roots or minimum value of the expression.
. Use the FUNCTION statement to specify the expression whose
roots or minimum value is to be found; do NOT specify a
dependent variable and equal sign -- specify only the
expression that is to be minimized.
. Do not include any data records after the DATA statement; it
simply signals the end of the command file and causes the
analysis to begin.
The following is an example command file to find the root of the
expression sin(x)-log(x):
Parameter x;
Function sin(x) - log(x);
Data;
Notice that the "variable" in the expression, X, is not declared
to be a variable but rather a parameter. This example is included
in the file MINSL.NLR which you can run.
For this type of analysis, Nonlin determines the values of the
parameters that minimize the absolute value of the expression. If
the expression has a zero value (i.e., a root), that value is
found since that is the smallest possible absolute value. If the
expression does not have a zero point, Nonlin determines the
values of the parameters that produce the smallest absolute value
of the expression. For example, the expression 2*X^2-3*X+10 does
not have a root but reaches a minimum value of 8.875 when X is
0.75. The MINPAROB.NLR command file contains this example.
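The figures quoted for 2*X^2-3*X+10 can be confirmed with a few
lines of Python. This uses the closed-form vertex of a parabola
rather than Nonlin's iterative method; the variable names are
chosen for illustration.

```python
def expr(x):
    return 2 * x ** 2 - 3 * x + 10

a, b, c = 2.0, -3.0, 10.0
discriminant = b * b - 4 * a * c   # negative, so there is no real root
x_min = -b / (2 * a)               # vertex of the parabola
min_value = expr(x_min)            # smallest value of the expression
```

Because the expression is positive everywhere, its smallest
absolute value is simply its minimum value.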
There are a number of cautions that you should keep in mind when
using Nonlin to find roots or minimum values:
. Nonlin will find only one root or minimum value per analysis.
For example, the expression 9-X^2 has two roots: -3 and +3.
Nonlin will find one of the roots; which one it finds depends
on the starting value specified for X.
. Nonlin will find only real roots, not complex.
. If the expression contains a local minimum, Nonlin may find it
rather than the global minimum or root. Of course, if you are
looking for a local minimum in a certain region this could be
considered a feature. For example, the expression
0.5*X^3+5*(X-2)^2+15 has a local minimum at X=1.61 and a root
at X=-13.38. If the starting value of X is less than -8.3 the
root is found; if the starting value is greater than -8.3, the
local minimum is found. If the expression contains only a
single variable, use the Mathplot program to graphically
display the expression and determine a good starting value for
the variable (see Section 10.1 for additional information about
Mathplot). The SWEEP statement can also be used to try
multiple starting values when searching for a global minimum.
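The local minimum, the root, and the dividing point quoted for
0.5*X^3+5*(X-2)^2+15 can be checked with a Python sketch. It uses
the quadratic formula and plain bisection, not Nonlin's algorithm;
the helper names and the bracketing interval are assumptions. The
dividing point near -8.3 is the local maximum of the expression.

```python
import math

def expr(x):
    return 0.5 * x ** 3 + 5 * (x - 2) ** 2 + 15

# Stationary points: derivative 1.5*x^2 + 10*x - 20 = 0
disc = math.sqrt(10.0 ** 2 + 4 * 1.5 * 20.0)
local_min_x = (-10.0 + disc) / (2 * 1.5)   # about 1.61
local_max_x = (-10.0 - disc) / (2 * 1.5)   # about -8.28, the dividing point

def bisect_root(f, lo, hi, iterations=60):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid       # root lies in the upper half
        else:
            hi = mid       # root lies in the lower half
    return 0.5 * (lo + hi)

root_x = bisect_root(expr, -14.0, -13.0)   # about -13.38
```

The expression is positive at the local minimum, so a search that
starts to the right of the dividing point settles there and never
reaches the root.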
8.2.1 Function Minimization Examples
MINFALL.NLR -- The time taken for an object to slide down a
frictionless guide from position (0,h) to another position
(d,0) (i.e., falling through a distance 'h' while moving
horizontally a distance 'd') depends on the path that the
object takes as it follows the guide. It turns out that the
path that minimizes the descent time is not a straight line
from (0,h) to (d,0) but rather a curve called a
brachistochrone with a steeper slope near the beginning, that
gives the object a chance to accelerate quickly, and then a
shallower slope further on.
Finding the shape of this curve is a classic problem in the
branch of mathematics called the Calculus of Variations. The
MINFALL example solves a simpler case of this problem: the
object slides along a straight guide from (0,1000) to an
intermediate position (px,py), and then along another straight
guide from (px,py) to (1000,0). What point, (px,py),
minimizes the descent time?
Note concerning the answer: The fall time for the object if it
follows a straight guide from (0,1000) to (1000,0) is 2.0203
seconds; the fall time if it follows the two straight segments
found by MINFALL is 1.8748; the fall time if it follows the
ideal curved brachistochrone is 1.8590. The speed of the
object at the end of the fall is the same regardless of the
path taken due to conservation of energy.
MINFUEL.NLR -- A lunar lander is hovering above the surface of the
moon looking for a suitable landing site. Available fuel is
critical and the desired site is 200 meters away. How long
should the horizontal thruster be fired to start and stop the
motion over the ground? The vertical thruster must be used
continuously to keep the lander from being pulled to the
surface. If too little horizontal thrust is used the
spacecraft will move slowly and much fuel will be consumed by
the vertical thruster counterbalancing the downward
gravitational pull while hovering over the surface. On the
other hand, if the horizontal thruster is fired for a long
time, the spacecraft will move quickly (minimizing the
hovering time) but excessive fuel will be used during the
horizontal acceleration and deceleration. MINFUEL.NLR
determines how long the thruster should be fired during the
start and stop accelerations such that the total fuel
consumption (start thrust + stop thrust + hover) is minimized.
Chapter 9
Acknowledgement and Use of Nonlin
9.1 Acknowledgement
The nonlinear regression algorithm used by Nonlin was published in
Dennis, J.E., Gay, D.M., and Welsch, R.E., "An adaptive nonlinear
least-squares algorithm," ACM Transactions on Mathematical Software
7,3 (Sept. 1981).
9.2 Use and Distribution of Nonlin
There are two versions of the Nonlin program: shareware and
registered. You are welcome to make copies of the shareware
version of Nonlin and pass them on to friends or post this program
on bulletin boards or distribute it via disk catalog services,
CD-ROMs, or other means provided the entire Nonlin distribution is
included in its original, unmodified form. A distribution fee may
be charged for the cost of the diskette, shipping and handling.
Vendors are encouraged to contact the author to get the most
recent version of Nonlin.
As a shareware product, you are granted a no-cost, trial period of
30 days during which you may evaluate Nonlin. If you find Nonlin
to be useful, educational, and/or entertaining, and continue to
use it beyond the 30 day trial period, you are required to
compensate the author by sending the registration form printed at
the end of this document (and in REGISTER.DOC) with the
appropriate registration fee to help cover the development and
support of Nonlin.
In return for registering, you will be authorized to continue
using Nonlin beyond the trial period and you will receive a
registered version of the program, a bound typeset manual, and
three months of support via telephone, mail, or CompuServe. Your
registration fee will be refunded if you encounter a serious bug
that cannot be corrected.
The registered version of Nonlin omits the shareware notification
screen at the start of the run and does not require you to press a
key to proceed with the analysis. The registered version also
includes the ability to print plots on HP LaserJet printers. The
registered version of Nonlin is NOT shareware and may not be
redistributed or used on more than one computer system.
The author frequently improves Nonlin and it is likely that the
version you have is not the most recent version. Note that the cost
of registering Nonlin is insignificant compared with what you
would have to pay to purchase a commercial statistical package
with an equivalent regression capability.
9.3 Association of Shareware Professionals
This program is produced by a member of the Association of
Shareware Professionals (ASP). ASP wants to make sure that the
shareware principle works for you. If you are unable to resolve a
shareware-related problem with an ASP member by contacting the
member directly, ASP may be able to help. The ASP Ombudsman can
help you resolve a dispute or problem with an ASP member, but does
not provide technical support for members' products. Please write
to the ASP Ombudsman at 545 Grover Road, Muskegon, MI 49442 or
send a CompuServe message via CompuServe Mail to ASP Ombudsman
7007,3536.
You are welcome to contact the author:
Phillip H. Sherrod
4410 Gerald Place
Nashville, TN 37205-3806 USA
615-292-2881 (evenings)
CompuServe: 76166,2640
Internet: 76166.2640@compuserve.com
9.4 Copyright Notice
Both the Nonlin program and documentation are copyright (c)
1992-1994 by Phillip H. Sherrod. You are not authorized to
modify the program. "Nonlin" is a trademark.
9.5 Disclaimer
This software and documentation are provided on an "as is" basis.
This program may contain "bugs" and inaccuracies, and its results
should not be assumed to be correct unless they are verified by
independent means. Phillip H. Sherrod disclaims all warranties
relating to this software, whether expressed or implied, including
but not limited to any implied warranties of merchantability or
fitness for a particular purpose. Neither Phillip H. Sherrod nor
anyone else who has been involved in the creation, production, or
delivery of this software shall be liable for any indirect,
consequential, or incidental damages arising out of the use or
inability to use such software, even if Phillip H. Sherrod has
been advised of the possibility of such damages or claims. The
person using the software bears all risk as to the quality and
performance of the software.
This agreement shall be governed by the laws of the State of
Tennessee and shall inure to the benefit of Phillip H. Sherrod
and any successors, administrators, heirs and assigns. Any action
or proceeding brought by either party against the other arising
out of or related to this agreement shall be brought only in a
state or federal court of competent jurisdiction located in
Davidson County, Tennessee. The parties hereby consent to in
personam jurisdiction of said courts.
Chapter 10
Other Software
10.1 Mathplot -- Mathematical Function Plotting Program
If you like Nonlin, you should check out the Mathplot program by
the same author.
Mathplot allows you to specify complicated mathematical functions
using ordinary algebraic expressions and immediately plot them.
Four types of functions may be specified: cartesian (y=f(x));
parametric cartesian (y=f(t) and x=f(t)); polar (radius=f(angle));
and parametric polar (radius=f(t) and angle=f(t)). Up to four
functions may be plotted simultaneously. Scaling is automatic.
Options are available to control axis display and labeling as well
as grid lines. Hard copy output may be generated as well as
screen display. Mathplot is an ideal tool for engineers,
scientists, math and science teachers, and anyone else who needs
to quickly visualize mathematical functions.
10.2 TSX-32 -- Multi-User Operating System
If you have a need for a multi-user, multi-tasking operating
system, you should look into TSX-32. TSX-32 is a full-featured,
high performance, multi-user operating system for the 386 and 486
that provides both 32-bit and 16-bit program support. With
facilities such as multitasking and multisessions, networking,
virtual memory, X-Windows, background batch queues, data caching,
file access control, real-time, and dial-in support, TSX-32
provides a solid environment for a wide range of applications.
A two user, shareware version of TSX-32 called TSX-Lite is also
available.
TSX-32 is not a limited, 16-bit, multi-DOS add-on. Rather, it is
a complete 32-bit operating system which makes full use of the
hardware's potential, including protected mode execution, virtual
memory, and demand paging. TSX-32 sites range from small systems
with 2-3 terminals to large installations with more than 150
terminals on a single 486.
In addition to supporting most popular 16-bit DOS programs, TSX-32
also provides a 32-bit "flat" address space with both Phar Lap and
DPMI compatible modes of execution.
Since the DOS file structure is standard for TSX-32, you can
directly read and write DOS disks. And, you can run DOS part of
the time and TSX-32 the rest of the time on the same computer.
TSX-32 allows each user to control up to 10 sessions. Programs
can also "fork" subtasks for multi-threaded applications. The
patented Adaptive Scheduling Algorithm provides consistently good
response time under varying conditions.
The TSX-32 network option provides industry standard TCP/IP
networking through Ethernet and serial lines. Programs can access
files on remote machines as easily as on their own machine. The
SET HOST command allows a user on one machine to log onto another
computer in the network. FTP, Telnet, and NFS are available for
interoperability with other systems.
System requirements: 386 or 486 system, 4MB memory, 12MB of free
disk space (Stacker and DoubleSpace are not supported).
TSX-32 is, quite simply, the best and most powerful operating
system available for the 386 and 486. For additional information
contact:
S&H Computer Systems, Inc.
1027 17th Avenue South
Nashville, TN 37212 USA
615-327-3670 (voice)
615-321-5929 (fax)
CompuServe: 71333,27
Internet: 71333.27@compuserve.com
10.3 SIMSTAT -- Interactive Statistics Program
If you need a general-purpose statistical package, or would like a
menu-oriented, mouse-aware interface to Nonlin, I suggest you try
the SIMSTAT program written by Normand Peladeau.
SIMSTAT is a menu driven statistical program that provides many
basic descriptive and comparative statistics and includes a
"bridge" module to allow it to function as a "front end" and data
editor for Nonlin.
The shareware version of SIMSTAT is available from the IBMAPP
forum of CompuServe and from many BBSs, or you can contact the
author at the address below. SIMSTAT version 3.0 or later is
required for use with Nonlin. You must also have the "bridge"
module called SIM2NL. SIM2NL.ZIP is included on the distribution
disk with the registered version of Nonlin.
For information about SIMSTAT, contact: Normand Peladeau, Provalis
Research, 5000 Adam Street, Montreal, QC H1V 1W5, Canada,
CompuServe: [71760,2103], Internet: 71760.2103@compuserve.com.
===============================================================
Software Order Form
===============================================================
Name ______________________________________________________
Address ___________________________________________________
City _______________________ State _______ Zip ___________
Country ____________________ Telephone ___________________
Internet address (optional) _______________________________
Nonlin version ____________________________________________
Bulletin board where you found Nonlin _____________________
Comments __________________________________________________
Check the box below which indicates your order type:
___ I wish to register Nonlin ($45).
___ I wish to order Mathplot ($20).
___ I wish to register Nonlin and order Mathplot ($60).
Add $5 to any amount shown above if the software is being shipped
outside the United States. I cannot accept checks from non-US
banks. Visa, MasterCard, and American Express credit card charges
are accepted, but a check, money order, or cash is preferred. If
you wish to use a credit card, specify the billing name, address,
card number, and expiration date.
In return for registering, you will receive the registered version
of the program, a laser-printed, bound copy of the manual, and
three months of telephone or CompuServe support. Your
registration fee will be refunded if you find a serious bug that
cannot be corrected.
Distribution disk choice (check one):
3.50" HD (1.4 MB) ______
5.25" HD (1.2 MB) ______
5.25" DD (360 KB) ______
Send this form with the amount indicated to the author:
Phillip H. Sherrod
4410 Gerald Place
Nashville, TN 37205-3806 USA
615-292-2881 (evenings)
CompuServe: 76166,2640
Internet: 76166.2640@compuserve.com
Index 62
80x87 coprocessor, 47
ABS function, 11
Absolute convergence, 44
Acknowledgement, 56
ACOS function, 11
Adaptive algorithm, 43
AFCTOL value, 43
AIDS growth curve, 48
AND operator, 10
ANGLETYPE statement, 23
Arc cosine function, 11
Arc sine function, 11
Arc tangent function, 11
Arithmetic operators, 9
Arrays
   declaration, 19
   initialization, 19
   storage order, 19
   subscripts, 20
ASIN function, 11
ASP, 57
Assignment operators, 10, 32
Assn. of Shareware Prof., 57
Asymptotic function example, 48
ATAN function, 11
Author address, 57
Autoregression test, 30, 39
Average deviation, 38
AVL tree example, 50
Bessel function, 13, 15
Beta function, 11
BETAI function, 11
Brachistochrone, 55
BREAK statement, 33, 34
Built-in functions, 11
Built-in constant, 11
Calculus of variations, 55
CEIL function, 11
Chebyshev function, 15
Circular regression, 52
Clapeyron's equation, 49
Comma operator, 10
Command files, 16
Commands, 18
Comments, 16
Comparison operators, 10
Compass to polar, 11
CompuServe, 57, 60
Conditional operator, 10
Confidence intervals, 22
CONFIDENCE statement, 22, 38
CONSTANT command, 20
Constant variables, 7
CONSTRAIN statement, 20, 46
CONTINUE statement, 33, 35
Convergence criterion, 43
Convergence failures, 45
Cooling example, 49
Copyright notice, 57
CORRELATE statement, 22, 41
Correlation matrix, 41
COS function, 11
Cosecant function, 11
COSH function, 11
COT function, 11
COVARIANCE statement, 22
CSC function, 11
CTOP function, 11
DATA statement, 35
DEG function, 12
Degrees for functions, 23
Degrees to radians, 14
Dependent variable, 7
Deviation
   average, 38
   definition, 2
   maximum, 38
Diode current example, 50
Disclaimer, 57
DO statement, 33
DOMAIN option, 25, 28, 29
DOUBLE statement, 19
DPMI support, 59
Durbin-Watson statistic, 39
EI2 function, 12
EIC1 function, 12
EIC2 function, 12
EL1 function, 12
Elliptic integral function, 12
ERF function, 12
Examples, 48
   AIDS growth curve, 48
   asymptotic function, 48
   AVL tree, 50
   boiling water, 49
   circular regression, 52
   cooling, 49
   diode current, 50
   fire station location, 53
   function minimization, 55
   linear regression, 48
   lunar lander, 55
   magnet force, 50
   minimum time path, 55
   multivariate, 48
   negative exponential, 49
   piecewise function, 50
   quadratic equation, 48
   square wave, 49
   SWEEP statement, 49
EXP function, 12
Exponentiation operator, 9
Expression minimization, 53
FAC function, 12
Factorial function, 12
Fire station example, 53
FLOOR function, 12
FOR statement, 34
Function minimization, 53
FUNCTION statement, 21
Functions, 11
GAMMA function, 12
GAMMAI function, 12
GAMMALN function, 12
Gauss-Newton algorithm, 43
Growth curve, 48
HAV function, 12
Hessian, 43
Hyperbolic cosine function, 11
Hyperbolic sine function, 15
Hyperbolic tangent function, 15
IF statement, 32
Include files, 16
#INCLUDE statement, 16
Incomplete beta function, 11
Independent variables, 7
Input variables, 6
Installing Nonlin, 4
INT function, 13
Internet, 57, 60
Inverse gamma function, 12
ITERATIONS statement, 23
J0 function, 13
J1 function, 13
JN function, 13
Keywords, 6
LaserJet printer, 4, 56
   resolution, 31
Least squares regression, 2
Levenberg-Marquardt, 43
Limits, 47
Linear regression, 1, 3
   example, 48
Listing file, 5
LOG function, 13
Log gamma function, 12
LOG10 function, 13
LOG2 function, 13
Logical operators, 10
Logistic curve, 48
Lunar lander example, 55
Magnet example, 50
Marquardt algorithm, 43
Mathplot, 59
MAX function, 13
Maximum values, 37
MIN function, 13
Minimization algorithm, 43
Minimization problem, 53
Model/trust region, 43
Modulo operator, 9
Multi-user operating system, 59
Multiple determination, 39
Multivariate regression, 48
Mutually dependent, 46
Natural log function, 13
Negative exponential, 2, 49
Networking, 60
NOECHO statement, 32
NONLIN environment variable, 4
NONLIN.DOC file, 4
NONLIN.EXE file, 4
NONLIN.FON file, 4, 25
NONLIN.LJF file, 4
NORMAL function, 13
Normal probability plot, 30
NOT operator, 10
NPD function, 13
NPLOT statement, 8, 30
   autocorrelation test, 40
   options, 30
Numeric constants, 10
Numeric coprocessor, 47
OBS system variable, 7, 24
Operators
   assignment, 10
   comma, 10
   comparison, 10
   conditional, 10
   logical, 10
   precedence of, 10
   subscript, 10
OR operator, 10
Order form, 61
OUTPUT statement, 24
PARAMETER statement
   function minimization, 53
PARAMETERS statement, 18
PAREA function, 13
Performance issues, 46
Periodic data, 40
PI constant, 11
Piecewise function example, 50
PLOT statement, 7, 24
   options, 25
Plots
   data and function, 25
   normal probability, 30
   residual values, 28
   resolution, 31
   scatter, 26
   width of, 31
Polar to compass, 14
Polar to rectangular, 14
Polynomial equation example, 48
POUTPUT statement, 24
   extra precision output, 37
Precedence of operators, 10
PREDICTED system variable, 6, 8, 9, 24
PRESOLUTION statement, 31
PRINTF function, 13
Prob(t) value, 38
Program limits, 47
PTOC function, 14
PTORX function, 14
PTORY function, 14
PULSE function, 14
Quadratic equation example, 48
RAD function, 14
Radian to degree conversion, 12
Radians for functions, 23
RANDOM function, 14
Ra^2 statistic, 39
Real-time, 59
Rectangular to polar, 15
REGISTER.DOC file, 4
Registering Nonlin, 56
Registration form, 61
Relational operators, 10
Relative convergence, 44
Remainder operator (modulo), 9
Reserved words, 6
Residual
RESIDUAL system variable, 6, 8, 24
Residual values
   plotting, 28
   average, 38
   definition, 2
   maximum, 38
RFCTOL value, 43
Root finding, 53
ROUND function, 15
RPLOT statement, 8, 28
   autocorrelation test, 40
   options, 29
RTOPA function, 15
RTOPD function, 15
R^2 statistic, 39
S&H Computer Systems, Inc., 60
Scatter plots, 26
SEC function, 15
Secant function, 15
SEL function, 15
SET command, 4
Sherrod, Phillip H., 57
SIM2NL, 60
SIMSTAT, 60
SIN function, 15, 40
Singular matrix problems, 46
SINH function, 15
SPLOT statement, 7, 26
   options, 26
SQRT function, 15
Square wave example, 49
Standard deviation, 37
Standard error function, 12
Statements, 18
STEP function, 15
STOP statement, 35
Subscript operator, 10
Subscripts, 20
Support of Nonlin, 56
SWEEP statement, 21
   convergence failure, 45
   example, 49
   function minimization, 54
   performance issues, 47
Symbolic constants, 7, 11, 20
System variables, 6
T function, 15
t statistic, 37, 38
TAN function, 15
TANH function, 15
TCP/IP, 60
Theory of operation, 43
Time series data, 39
TITLE statement, 18
TOLERANCE statement, 23
   converge criterion, 43
Transformed function, 3
TREND example, 40
TREND.NLR, 49
Trigonometric functions
   degrees or radians, 23
TSX-32, 59
TSX-Lite, 59
Use and distribution, 56
VARIABLES statement, 18
Variance-covariance matrix,
22
Vectors
   see Arrays, 19
Warranty, 57
WHILE statement, 33
WIDTH statement, 31
X windows, 59
Y0 function, 15
Y1 function, 15
YN function, 15